Project description
1 6G4Z1001 Programming RESIT Assignment
1.1 Specification
The field of text analysis makes use of many different tools in order to extract information about written works. One
of these methods is word length frequency, which is used to assess patterns in writing (for example, one author might
prefer to use short, snappy words, whilst another might prefer longer words). Such information can be useful in helping
to identify (or rule out) the author of an anonymous piece of text. In its simplest form, frequency analysis calculates
the number of words of length 1, 2, 3, ..., n, where n is the length of the longest word. For example, analysing the
sentence ‘I am a man’ would produce the output 2, 1, 1, because there are
• 2 words of length 1, ‘I’ and ‘a’;
• 1 word of length 2, ‘am’;
• 1 word of length 3, ‘man’.
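The counting described above can be sketched as follows. This is a minimal illustration, not the required solution: the class name WordLengthFrequency and the method countLengths are hypothetical, and the sketch assumes words are separated by whitespace and contain no punctuation.

```java
import java.util.Arrays;

// Hypothetical helper class; the names here are assumptions, not part of the brief.
class WordLengthFrequency {

    // Returns an array in which index k holds the number of words of length k.
    // Index 0 is kept only so that indices line up with word lengths.
    static int[] countLengths(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return new int[1];
        }
        String[] words = trimmed.split("\\s+");

        // First pass: find the longest word so the array is the right size.
        int longest = 0;
        for (String word : words) {
            longest = Math.max(longest, word.length());
        }

        // Second pass: tally one count per word.
        int[] counts = new int[longest + 1];
        for (String word : words) {
            counts[word.length()]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countLengths("I am a man");
        // Skip index 0 so the output lists lengths 1..n.
        System.out.println(Arrays.toString(Arrays.copyOfRange(counts, 1, counts.length)));
        // prints [2, 1, 1]
    }
}
```

Sizing the array from the longest word keeps the output at exactly n entries, matching the 1, 2, 3, ..., n description.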
This assignment requires you to produce a Java program which performs a text analysis, calculating a range of statistics
on a piece of text. You must also document the testing of the program. The specification for the assignment is as
follows:
1. The program is a stand-alone Java application (not an applet); that is, it has a main method.
2. Its initial state is a while loop in which you enter text from the command line/terminal (or however your Java IDE
accepts text input).
3. Your assignment submission must include a Java file and a PDF file. The PDF file contains a pseudocode
construction of the algorithm used to solve the problem and a range of test cases.
4. On Moodle there is a file start.java, which you can use as a starting point for the assignment.
2 Useful Java Functions
The following String functions:
1. split (see code listing 4);
2. charAt;
3. indexOf.
are very useful in this assignment. Examples of their usage appear in the notes and online.
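As a brief illustration of the three functions, the sketch below wraps each one in a small helper. The class and method names (StringFunctionsDemo, words, countChar, firstWord) are hypothetical and chosen only for this example; substring is used alongside indexOf purely to show what the returned position is good for.

```java
// Hypothetical demo class; none of these names come from the brief.
class StringFunctionsDemo {

    // split: break a line into words around runs of whitespace.
    static String[] words(String line) {
        return line.trim().split("\\s+");
    }

    // charAt: count how often one character occurs by visiting each position.
    static int countChar(String text, char target) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == target) {
                count++;
            }
        }
        return count;
    }

    // indexOf: return the text before the first space, or the whole line
    // if there is no space (indexOf returns -1 when nothing matches).
    static String firstWord(String line) {
        int space = line.indexOf(' ');
        return space == -1 ? line : line.substring(0, space);
    }

    public static void main(String[] args) {
        String line = "the cat sat";
        System.out.println(words(line).length);   // prints 3
        System.out.println(countChar(line, 't')); // prints 3
        System.out.println(firstWord(line));      // prints the
    }
}
```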
2.1 Assessment Levels
In each case, achievement at a given degree level requires that all lower levels also be fulfilled (so, for example, a
student aiming for a mark of 65% must complete all of both the 40-49% and 50-59% criteria before addressing the
criteria in 60-69%). We strongly advise you to complete, test (on the laboratory computers) and save your assignment
at each level before you go on to a higher one. In this way you can be sure to have an earlier version to submit, even if
your final, complex, version has an error as the deadline approaches. In each case, the criteria listed are the minimum
requirements for that mark band. Additional marks within a band can be obtained by improving the quality of your
code, or implementing some of the features required for the next level.
MMU 2 CMDT
6G4Z1001 Programming NPC Assignment 1
2.1.1 pass (40%)
1. Compiles without error.
2. User can enter a line of data at the prompt/terminal.
3. The program displays the line the user entered.
4. The program counts the frequency of each word length in the line of data and displays it on the terminal/prompt.
5. The program loops back to 2 and waits for another line to be typed; this is an infinite while loop (code is provided
for the loop in the listing in section 4).
6. The submitted PDF contains mention of the pseudocode used to approach the problem and evidence of some
simple test cases.
2.1.2 Third (40-49%)
1. All earlier criteria are met.
2. The mean word length is evaluated and displayed at the prompt.
3. The algorithm does not count punctuation marks as part of a word, so
the cat sat on the mat!
consists of 1 word of length 2 and 5 words of length 3. That is, ‘mat’ is a 3-letter word, not a 4-letter word. The
algorithm processes exclamation marks, question marks, full stops and commas correctly.
4. The submitted PDF contains a complete set of test cases.
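One possible way to handle punctuation and the mean is sketched below. The names PunctuationAndMean, stripPunctuation and meanWordLength are hypothetical. The sketch assumes that the four marks listed above are simply deleted wherever they occur in a token, which is a simplification (it would also remove the full stop inside a number such as 3.14).

```java
// Hypothetical helper class; the names are assumptions made for this sketch.
class PunctuationAndMean {

    // Removes exclamation marks, question marks, full stops and commas.
    // Inside a character class, '.' and '?' are treated as literal characters.
    static String stripPunctuation(String word) {
        return word.replaceAll("[!?.,]", "");
    }

    // Mean length of the words in a line, ignoring the punctuation above.
    // Returns 0.0 for a line that contains no words.
    static double meanWordLength(String line) {
        int totalLetters = 0;
        int wordCount = 0;
        for (String token : line.trim().split("\\s+")) {
            String word = stripPunctuation(token);
            if (!word.isEmpty()) {
                totalLetters += word.length();
                wordCount++;
            }
        }
        // Cast before dividing so the result is not truncated to an integer.
        return wordCount == 0 ? 0.0 : (double) totalLetters / wordCount;
    }

    public static void main(String[] args) {
        System.out.println(stripPunctuation("mat!").length()); // prints 3
        // 17 letters over 6 words, so roughly 2.83.
        System.out.println(meanWordLength("the cat sat on the mat!"));
    }
}
```

Tokens that consist only of punctuation are skipped, so a stray "!" does not count as a word of length 0.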
2.1.3 2ii (50-59%)
1. All earlier criteria are met.
2. The information is displayed in bar chart form: the y axis denotes the word length, the x axis its frequency.
3. The visual appearance of the bar chart scales in an intelligent manner. To expand on this, consider the two
sentences:
• ‘I say by the way’
• ‘antidisestablishmentarianism is a good idea’
For the first sentence the y axis of the bar chart goes from 1 to 3, whilst for the
second sentence it goes from 1 to 28. Your graphical output should ensure that in the first case the bar width
is approximately 1/3 of the screen, whilst in the second it should be 1/28; that is, whatever the data set is, the
algorithm ensures the bar chart occupies a large part of the drawing window. The same argument applies to the
scaling of the height of the bar chart as well.
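The scaling arithmetic can be sketched independently of any drawing code. In the sketch below the class BarChartScaling, its two methods and the pixel sizes 420 and 600 are all assumptions made for illustration; the brief does not prescribe a window size or a graphics library.

```java
// Hypothetical helper class; names and pixel sizes are illustrative assumptions.
class BarChartScaling {

    // Size of one bar along the word-length axis, so that the bars for
    // lengths 1..longestWord together fill the available pixels.
    static int barThickness(int axisPixels, int longestWord) {
        return axisPixels / Math.max(1, longestWord);
    }

    // Size of one bar along the frequency axis; the most frequent word
    // length fills the whole axis and the others scale in proportion.
    static int barLength(int frequency, int maxFrequency, int axisPixels) {
        if (maxFrequency <= 0) {
            return 0;
        }
        return frequency * axisPixels / maxFrequency;
    }

    public static void main(String[] args) {
        // 'I say by the way': longest word has 3 letters.
        System.out.println(barThickness(420, 3));   // prints 140
        // 'antidisestablishmentarianism is a good idea': longest word has 28 letters.
        System.out.println(barThickness(420, 28));  // prints 15
        // In the first sentence, length 3 occurs 3 times and length 1 once.
        System.out.println(barLength(3, 3, 600));   // prints 600
        System.out.println(barLength(1, 3, 600));   // prints 200
    }
}
```

Because both sizes are derived from the data (the longest word and the highest frequency), the chart fills the same area whatever text is entered.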
2.1.4 2i (60-69%)
1. All earlier criteria are met.
2. In addition to being able to enter text in the text field, the user has an option to load a text file. From the program
prompt:
load filename
will process the text file called filename. That is, load is a keyword: when the program meets it, it needs to process
it as a special command, not as a data line.
3. The analysis is displayed correctly in bar chart form.
4. The visual representation of the data scales dynamically with the data set; see criterion 2.1.3-3.
5. The code is well-structured and uses parameterised methods (of your own creation) in an intelligent manner.
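One way to separate the load keyword from ordinary data lines is sketched below. The class LoadCommand, the methods parseLoad and readLines, and the file name essay.txt are hypothetical. The sketch assumes Java 8 or later (for the single-argument Files.readAllLines, which reads the file as UTF-8) and treats a bare "load" with no file name as an ordinary data line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper class; the names are assumptions made for this sketch.
class LoadCommand {

    // Returns the file name if the line is a load command, otherwise null.
    // The limit of 2 keeps any spaces inside the file name intact.
    static String parseLoad(String line) {
        String[] parts = line.trim().split("\\s+", 2);
        if (parts.length == 2 && parts[0].equals("load")) {
            return parts[1];
        }
        return null;
    }

    // Reads every line of the named file so each can be analysed in turn.
    static List<String> readLines(String fileName) throws IOException {
        return Files.readAllLines(Paths.get(fileName));
    }

    public static void main(String[] args) {
        String line = "load essay.txt"; // essay.txt is a made-up file name
        String fileName = parseLoad(line);
        if (fileName == null) {
            System.out.println("data line: " + line);
            return;
        }
        try {
            for (String fileLine : readLines(fileName)) {
                System.out.println(fileLine);
            }
        } catch (IOException e) {
            System.out.println("Could not read " + fileName);
        }
    }
}
```

Comparing the whole first token with equals, rather than testing whether the line starts with "load", means a data line such as "loaded dice roll" is not mistaken for a command.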
2.1.5 i (70-79%)
1. All earlier criteria are met.
2. Another keyword, swap, is implemented. It takes two arguments: the two characters to be swapped. In all text entered
after the swap command, the display shows the two characters switched.
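A possible shape for the swap keyword is sketched below. The brief does not fix the command syntax, so the form "swap a t" (keyword followed by two single characters separated by spaces) is an assumption, as are the names SwapCommand, parseSwap and swapChars.

```java
// Hypothetical helper class; the command syntax and names are assumptions.
class SwapCommand {

    // Parses "swap x y" into the two characters, or returns null when the
    // line is not a well-formed swap command.
    static char[] parseSwap(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length == 3 && parts[0].equals("swap")
                && parts[1].length() == 1 && parts[2].length() == 1) {
            return new char[] { parts[1].charAt(0), parts[2].charAt(0) };
        }
        return null;
    }

    // Returns the text with every a replaced by b and every b replaced by a.
    // Working character by character avoids the trap of two successive
    // replace calls, where the second would undo the first.
    static String swapChars(String text, char a, char b) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == a) {
                out.append(b);
            } else if (c == b) {
                out.append(a);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        char[] pair = parseSwap("swap a t");
        System.out.println(swapChars("the cat sat", pair[0], pair[1]));
        // prints ahe cta sta
    }
}
```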
2.1.6 80+
1. All earlier criteria are met to an exceptionally high standard.
3 Trial Session
We present some trial output of a working program for a Mr Keith Yates with i.d. 123456. Note the code just illustrates
aspects of the specification. The user typed ‘the cats sat on a mat’, then
1. the program echoes back the line, required by criterion 2.1.1-3;
2. the program displays the frequency of the word lengths, required by criterion 2.1.1-4;
3. the program displays the frequency of word lengths in the form of a bar chart, required by criterion 2.1.3-2.
java Yates_Keith_123456
the cats sat on a mat
the cats sat on a mat
There are 0 words of length 0
There are 1 words of length 1
There are 1 words of length 2
There are 3 words of length 3
There are 1 words of length 4
There are 0 words of length 5
[Bar chart of the frequencies above: one bar per word length, with rows labelled Length 0 to Length 5.]
4 Skeletal Code
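Skeletal code
import java.util.Scanner;

public class Yates_Keith_123456 {
    public static void main(String[] args) {
        // Read from standard input (the terminal or the IDE console).
        Scanner scan = new Scanner(System.in);
        String line = "";
        // nextLine() never returns null, so test hasNextLine() instead;
        // the loop keeps waiting for input until the stream is closed.
        while (scan.hasNextLine()) {
            line = scan.nextLine();
            // Echo the line back to the user.
            System.out.println(line);
        }
    }
}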
5 Submission Procedure
The following are to be submitted via Moodle by the deadline:
1. A Java file Surname_Forename_grpNumber.java. In the Java file there will be a class which has the same
name as the Java file.
2. A PDF file Surname_Forename_grpNumber.pdf.
Note that submissions which are misnamed, incomplete or do not compile will be subject to penalty. Conformity
to the submission requirements will be enforced by the following rules (note that these penalties may be combined):
• Misnaming will incur a penalty of 5%, deducted from the mark which would otherwise have been given.
• Each alteration not relating to names, required to compile the code, will incur a 5% penalty. If five or more
lines must be changed, the submission will attract a failing mark regardless of the mark it would obtain if they
were corrected.
• Incomplete submissions (i.e. those with a file missing) will incur a minimum penalty of 10%.
• All code submitted must compile and run on the laboratory machines used to teach this unit. It is
your responsibility to ensure that your code will work correctly on the laboratory machines. Any code which
will not work on the laboratory computers will be treated as having compilation errors.
THE MANCHESTER METROPOLITAN UNIVERSITY
SCHOOL OF COMPUTING, MATHEMATICS AND DIGITAL
TECHNOLOGY
ASSIGNMENT COVER SHEET
COURSE MODULE: 6G4Z1001 Programming
LECTURER: K. Yates, N. Costen, M. Amos
ASSIGNMENT NUMBER: 1
ASSIGNMENT TYPE: Individual CW1
ISSUE DATE:
HAND IN DATE AND TIME: See Submission link on moodle
HAND BACK DATE :
PENALTIES FOR LATE HAND IN OF WORK: See the Regulations for
Undergraduate Programmes of Study.
MITIGATING CIRCUMSTANCES: See the Faculty Student Handbook.
PROPORTION OF COURSEWORK MARKS
AVAILABLE FROM THIS ASSIGNMENT: 40% of the unit
ASSESSMENT CRITERIA: See Attached Sheet
No responsibility is accepted by the School if an assignment is lost. To cover this eventuality, you are advised to
ensure you have the means of re-creating it. This is your responsibility.
PLAGIARISM: Students are reminded that plagiarism is a serious disciplinary matter. Working together is
encouraged, but direct sharing of code/written work is not.
Checks are regularly made for misuse of the web and other existing materials, and electronic submission will often
include automated plagiarism checks.
In addition to the assessment criteria described in the assignment brief, we may, in exceptional circumstances,
require that you attend a viva in order to ascertain your contribution to the work. You MUST attend a viva if asked
to do so. Failure to attend will result in a mark of Zero for this assignment.
Should you be asked to attend a viva, this will be done via student email, after the hand-in date for the assignment.
PROCEDURE FOR HANDING IN WORK: Via submission link on Moodle.