The String, Character, and StringBuilder Classes (and File I/O)

The String Class

String Construction and Initialization

There are several ways to create a String. We can use the normal constructor:

String myString = new String("I am a new String!");
String myEmptyString = new String(); 

We can use Java's shorthand initializer to the same end:

String myString = "I am a new String!";
String myEmptyString = "";

Strings are Immutable

It is important to realize that a String object is immutable. That is to say, the contents of a String never change from what they were when the string was created. This may seem a bit strange, especially since we see "re-definitions" of Strings all the time in the code we write. But be careful not to confuse the String itself with the reference variable that points to it.

For example, in the following code, two String objects are created and the reference variable s points first to "Java" and later to "HTML":

String s = "Java";   //creates a new string "Java" which is referenced by s
s = "HTML";          //s now references a new string "HTML" -- the earlier
                     //string "Java" is no longer referenced by ANY variable

The immutability of the String class may seem very strange and highly restrictive after having seen how arrays work -- but it is an essential feature for the class. There are multiple reasons for this, but perhaps the most important is that Strings are used by the class loading mechanism, and mutability in such a context would have profound and fundamental security consequences.
For example, had String been mutable, a request to load "java.io.Writer" could have been changed, after the request was made, to load "mil.vogoon.DiskErasingWriter" instead.
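
The difference between a String object and the reference variable that points to it can be checked with a small experiment. The following is only an illustrative sketch; the class and method names (ImmutabilityDemo, afterCallingMethods) are made up for this example and are not part of the Java API.

```java
class ImmutabilityDemo {
    // Calls several "modifying" methods on the given string, discards their
    // results, and returns the original reference. Hypothetical helper.
    static String afterCallingMethods(String original) {
        original.toUpperCase();        // returns a NEW string; original untouched
        original.concat("!!!");        // likewise
        original.replace('a', 'o');    // likewise
        return original;
    }

    public static void main(String[] args) {
        String s = "Java";
        String t = s;                  // t and s reference the same object
        s = s.toUpperCase();           // s now references a new string "JAVA"
        System.out.println(s);         // prints JAVA
        System.out.println(t);         // prints Java (the old object is unchanged)
        System.out.println(afterCallingMethods("Java"));   // prints Java
    }
}
```

Every String method that appears to change a string actually returns a new String, so the result must be assigned to a variable if it is to be kept.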

The java.lang.String class has several useful methods associated with it, such as:

java.lang.String
+equals(s1: String): boolean Returns true if this String is equal to s1
+equalsIgnoreCase(s1: String): boolean Returns true if this String is equal to String s1, ignoring case
+compareTo(s1: String): int Returns an integer greater than 0, equal to 0, or less than 0, to indicate whether this string is lexicographically greater than, equal to, or less than s1, respectively. For example, "apple".compareTo("banana") would return a negative integer to indicate that "apple" comes before "banana" in the dictionary.
+compareToIgnoreCase(s1: String): int Same as compareTo except that the comparison is case-insensitive
+regionMatches(toffset: int, s1: String, offset: int, len: int): boolean Returns true if the specified subregion of this string exactly matches the specified subregion in String s1
+regionMatches(ignoreCase: boolean, toffset: int, s1: String, offset: int, len: int): boolean Same as the preceding method except that you can specify whether the match is case-sensitive
+startsWith(prefix: String): boolean Returns true if this string starts with the specified prefix
+endsWith(suffix: String): boolean Returns true if this string ends with the specified suffix
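
As a rough illustration of how a few of these methods behave, consider the sketch below. The class StringMatchDemo and its helper isTextFile are hypothetical names chosen for this example; the helper uses regionMatches() to test for a ".txt" ending without regard to case.

```java
class StringMatchDemo {
    // Hypothetical helper: true if fileName ends with ".txt", ignoring case.
    static boolean isTextFile(String fileName) {
        String ext = ".txt";
        int start = fileName.length() - ext.length();
        if (start < 0) {
            return false;              // fileName is shorter than the extension
        }
        // Compare the last 4 chars of fileName with all of ext, ignoring case
        return fileName.regionMatches(true, start, ext, 0, ext.length());
    }

    public static void main(String[] args) {
        String s = "Welcome";
        System.out.println(s.equals("welcome"));            // prints false
        System.out.println(s.equalsIgnoreCase("welcome"));  // prints true
        System.out.println(s.startsWith("Wel"));            // prints true
        System.out.println(s.endsWith("come"));             // prints true
        System.out.println(isTextFile("notes.TXT"));        // prints true
        System.out.println(isTextFile("README"));           // prints false
    }
}
```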

Comparing Strings

Don't confuse comparing the contents of Strings with comparing their reference variables. We use the equals() method to compare the contents of two Strings; the "double-equals" operator instead checks whether the two reference variables point to the very same String object:

String s1 = new String("Welcome");
String s2 = new String("welcome");

if (s1.equals(s2)) {
   //s1 and s2 have the same contents
}

if (s1 == s2) {
   //s1 and s2 have the same reference
}
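
To see the two kinds of comparison give different answers, it helps to use two distinct objects with identical contents. The sketch below does so; the class and method names (StringEqualityDemo, sameContents, sameObject) are invented for illustration.

```java
class StringEqualityDemo {
    // True if the two strings hold the same sequence of characters.
    static boolean sameContents(String a, String b) {
        return a.equals(b);
    }

    // True only if both references point to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String a = new String("Welcome");
        String b = new String("Welcome");   // a second, separate object

        System.out.println(sameContents(a, b));  // prints true
        System.out.println(sameObject(a, b));    // prints false
        System.out.println(sameObject(a, a));    // prints true

        // Identical string literals share a single pooled object, so ==
        // happens to return true here. Do not rely on this; use equals().
        String c = "Welcome";
        String d = "Welcome";
        System.out.println(sameObject(c, d));    // prints true
    }
}
```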

We can compare the contents of Strings lexicographically with the compareTo() method:

String s1 = new String("Welcome");
String s2 = "welcome";

if (s1.compareTo(s2) > 0) {
   // s1 is greater than s2
}
else if (s1.compareTo(s2) == 0) {
   //s1 and s2 have the same contents
}
else {
   // s1 is less than s2
}
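
Only the sign of the value returned by compareTo() should be relied upon. The sketch below wraps the three-way test in a hypothetical helper named order (not a library method) and also shows that uppercase letters sort before lowercase ones, since the comparison is based on character codes.

```java
class CompareDemo {
    // Hypothetical helper: reports where a falls relative to b.
    static String order(String a, String b) {
        int result = a.compareTo(b);   // call once and reuse the result
        if (result > 0) {
            return "after";
        } else if (result == 0) {
            return "same";
        } else {
            return "before";
        }
    }

    public static void main(String[] args) {
        System.out.println(order("apple", "banana"));    // prints before
        System.out.println(order("banana", "apple"));    // prints after
        // 'W' has a smaller character code than 'w'
        System.out.println(order("Welcome", "welcome")); // prints before
        System.out.println("Welcome".compareToIgnoreCase("welcome") == 0); // prints true
    }
}
```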

String Length, Characters, and Combining Strings

Below are some more very useful methods of the String class:

java.lang.String
+length(): int Returns the number of characters in this String
+charAt(index: int): char Returns the character at the specified index from this String
+concat(s1:String): String Returns a new String that is the result of concatenation of this String with String s1

As examples of these, consider:

String message = "Welcome";
int strLen = message.length();     //strLen now equals 7
char doubleU = message.charAt(0);   //doubleU now equals 'W'
char em = message.charAt(5);        //em now equals 'm'

//Even though this closely parallels the indexing of arrays,
//do NOT use message[i] to access the chars of a String
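
A common use of length() together with charAt() is a loop that visits every character of a String. The following is a minimal sketch; countChar is a made-up helper name used only for this example.

```java
class CharAtDemo {
    // Hypothetical helper: counts how many times target appears in text.
    static int countChar(String text, char target) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {   // valid indices: 0 .. length()-1
            if (text.charAt(i) == target) {         // chars are compared with ==
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countChar("Welcome", 'e'));   // prints 2
        System.out.println(countChar("Welcome", 'z'));   // prints 0
    }
}
```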

We can concatenate (or join) Strings in two ways - either with the concat() method or with the "+" operator:

String s1 = new String("Hello ");
String s2 = "World";
String s3 = s1.concat(s2);  //s3 now contains "Hello World"
String s4 = s1 + s2;        //s4 also contains "Hello World"

String s5 = s1 + s2 + s3 + s4; 
// s5 now contains "Hello WorldHello WorldHello World"

String s6 = ((s1.concat(s2)).concat(s3)).concat(s4);
// s6 holds the same string now as s5 
// this is just another way to do the same thing

Extracting Substrings

Not only can we get bigger strings by concatenating smaller strings together, we can also use the following methods to extract a smaller substring from a given string:

java.lang.String
+substring(beginIndex: int): String Returns a substring of this string that begins with the character at the specified beginIndex and extends to the end of the string
+substring(beginIndex: int, endIndex: int): String Returns a substring of this string that begins at the specified beginIndex and extends to the character at index (endIndex - 1). Note: the character at endIndex is NOT part of the substring.

So for example,

String s1 = "0123456789";
String s2 = s1.substring(2,6);   //s2 now equals "2345"
String s3 = s1.substring(2);     //s3 now equals "23456789"
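
As a slightly more practical sketch, substring() can pull fixed-position fields out of a formatted string. The example below assumes a date written exactly as "YYYY-MM-DD"; the class and helper names (SubstringDemo, dateParts) are hypothetical.

```java
class SubstringDemo {
    // Hypothetical helper: splits a date of the exact form "YYYY-MM-DD"
    // into {year, month, day} using fixed index positions.
    static String[] dateParts(String date) {
        String year = date.substring(0, 4);    // indices 0..3
        String month = date.substring(5, 7);   // indices 5..6 (index 4 is '-')
        String day = date.substring(8);        // index 8 to the end
        return new String[] { year, month, day };
    }

    public static void main(String[] args) {
        String[] parts = dateParts("2024-03-15");
        System.out.println(parts[0]);   // prints 2024
        System.out.println(parts[1]);   // prints 03
        System.out.println(parts[2]);   // prints 15
    }
}
```

Note that the length of the two-argument result is always endIndex - beginIndex.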

Converting Case, Replacing Parts of Strings, and Splitting Strings

We can also convert a string from uppercase to lowercase and vice versa, replace a substring of a string with something else, and split strings according to the positions of a designated character called a delimiter:

java.lang.String
+toLowerCase(): String Returns a new string with all characters converted to lowercase
+toUpperCase(): String Returns a new string with all characters converted to uppercase
+trim(): String Returns a new string with any whitespace characters at the beginning or end of the string removed
+replace(oldChar: char, newChar: char): String Returns a new string where all of the characters of this string that match oldChar have been replaced with newChar
+replaceFirst(oldString: String, newString: String): String Returns a new string where the first substring that matches oldString has been replaced with newString (oldString is interpreted as a regular expression)
+replaceAll(oldString: String, newString: String): String Returns a new string where all substrings matching oldString have been replaced with newString (oldString is interpreted as a regular expression)
+split(delimiter: String): String[] Returns an array of strings consisting of the substrings split by the delimiter (the delimiter is interpreted as a regular expression)

Here are some examples of the above methods:

String s1 = "Welcome";
String s2 = s1.toLowerCase();       //s2 = "welcome"
String s3 = s1.toUpperCase();       //s3 = "WELCOME"
String s4 = s1.replace('e', 'A');   //s4 = "WAlcomA"

And here is an example of how to split a string into an array of strings along a specified delimiter and then use the resulting tokens to do something:

String s = "Cat-Dog-Mouse";
String[] tokens = s.split("-");    
//now tokens is the array {"Cat", "Dog", "Mouse"}

System.out.println(tokens[2] + " eats " + tokens[0]);
//prints "Mouse eats Cat"
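
Because split() treats its argument as a regular expression, delimiters such as "." or "|" that have a special meaning in regular expressions need to be escaped. The sketch below shows one way this might be handled; the helper names (splitOnDots, splitLiterally) are made up for this example, while Pattern.quote() is a standard method of java.util.regex.Pattern.

```java
import java.util.regex.Pattern;

class SplitDemo {
    // Splits on literal dots by escaping the regex metacharacter '.'.
    static String[] splitOnDots(String text) {
        return text.split("\\.");
    }

    // Hypothetical helper: splits on any delimiter taken literally.
    static String[] splitLiterally(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String address = "192.168.0.1";

        // Unescaped, "." matches EVERY character, so nothing is left over
        System.out.println(address.split(".").length);            // prints 0
        System.out.println(splitOnDots(address).length);          // prints 4
        System.out.println(splitLiterally("a|b|c", "|").length);  // prints 3

        // trim() removes leading and trailing whitespace only
        System.out.println("[" + "  padded  ".trim() + "]");      // prints [padded]
    }
}
```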

Locating Characters or Substrings in a String

Some methods of the java.lang.String class can be used to find the location (i.e., in terms of indices) of a character or substring within a string:

java.lang.String
+indexOf(ch: char): int Returns the index of the first occurrence of ch in the string (returns -1 if not matched)
+indexOf(ch: char, fromIndex: int): int Returns the index of the first occurrence of ch at or after fromIndex in the String (returns -1 if not matched)
+indexOf(s: String): int Returns the index of the first occurrence of String s in this String (returns -1 if not matched)
+indexOf(s: String, fromIndex: int): int Returns the index of the first occurrence of String s at or after fromIndex in the String (returns -1 if not matched)
+lastIndexOf(ch: char): int Returns the index of the last occurrence of ch in the string (returns -1 if not matched)
+lastIndexOf(ch: char, fromIndex: int): int Returns the index of the last occurrence of ch at or before fromIndex in the String, searching backward (returns -1 if not matched)
+lastIndexOf(s: String): int Returns the index of the last occurrence of String s in this String (returns -1 if not matched)
+lastIndexOf(s: String, fromIndex: int): int Returns the index of the last occurrence of String s that starts at or before fromIndex in the String, searching backward (returns -1 if not matched)

So for example,

String s = "Welcome to Java";

int a = s.indexOf('W');          //a = 0
int b = s.indexOf('x');          //b = -1
int c = s.indexOf('o',5);        //c = 9
int d = s.indexOf("come");       //d = 3
int e = s.indexOf("Java", 5);    //e = 11
int f = s.indexOf("java", 5);    //f = -1
int g = s.lastIndexOf('a');      //g = 14
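
The two-argument form of indexOf() is handy for finding every occurrence of a substring, by restarting the search just past the previous match. The following is a sketch of that idea; countOccurrences is a hypothetical helper, and it counts non-overlapping matches only.

```java
class IndexOfDemo {
    // Hypothetical helper: counts non-overlapping occurrences of target in text.
    static int countOccurrences(String text, String target) {
        if (target.isEmpty()) {
            return 0;                       // avoid looping forever on ""
        }
        int count = 0;
        int from = 0;
        while (true) {
            int index = text.indexOf(target, from);
            if (index == -1) {
                break;                      // no further matches
            }
            count++;
            from = index + target.length(); // resume just past this match
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "Welcome to Java";
        System.out.println(countOccurrences(s, "a"));     // prints 2
        System.out.println(countOccurrences(s, "o"));     // prints 2
        System.out.println(countOccurrences(s, "java"));  // prints 0
    }
}
```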

The Character Class

The Character class is a convenient "wrapper class" around the primitive char type. It stores essentially the same thing as a char, but also provides access to lots of useful methods involving this data type. Note that many of the methods are static and hence should be called on the Character class itself rather than on a Character object. Some of these methods are shown below. (In this list, the static methods are the ones from isDigit() through toUpperCase().)

java.lang.Character
+Character(value: char) Constructs a Character object with char value (deprecated since Java 9; Character.valueOf(value) or autoboxing is preferred)
+charValue(): char Returns the char value from this object
+compareTo(anotherCharacter: Character): int Compares this Character with another numerically, by char value (analogous to the compareTo() method of the String class)
+equals(anotherCharacter: Character): boolean Returns true if this Character equals anotherCharacter
+isDigit(ch: char): boolean Returns true if the specified char is a digit
+isLetter(ch: char): boolean Returns true if the specified char is a letter
+isLetterOrDigit(ch: char): boolean Returns true if the specified char is a letter or digit
+isLowerCase(ch: char): boolean Returns true if the specified char is a lowercase letter
+isUpperCase(ch: char): boolean Returns true if the specified char is an uppercase letter
+toLowerCase(ch: char): char Returns the lowercase equivalent of the specified char
+toUpperCase(ch: char): char Returns the uppercase equivalent of the specified char
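
The static methods are called on the class name, as in Character.isDigit(ch). The sketch below uses two of them; the class and helper names (CharacterDemo, countDigits, capitalize) are invented for this illustration.

```java
class CharacterDemo {
    // Hypothetical helper: counts the digit characters in text.
    static int countDigits(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isDigit(text.charAt(i))) {   // static call on the class
                count++;
            }
        }
        return count;
    }

    // Hypothetical helper: returns word with its first character uppercased.
    static String capitalize(String word) {
        if (word.isEmpty()) {
            return word;
        }
        return Character.toUpperCase(word.charAt(0)) + word.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(countDigits("Route 66, Exit 9"));  // prints 3
        System.out.println(capitalize("java"));               // prints Java

        Character boxed = 'x';                   // autoboxing creates the wrapper
        System.out.println(boxed.charValue());   // prints x (instance method)
    }
}
```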

The StringBuilder/StringBuffer Classes

StringBuilder/StringBuffer vs. String

The String class is clearly one of the most important classes in Java. No matter what kind of application you are working on, you will find heavy usage of this class. You will also find, however, that String creates a lot of garbage, because many temporary Strings are created as a program runs.

One of the biggest strengths of the String class, immutability, is also one of its biggest problems if it is not used correctly. Many times we create a String and then perform a lot of operations on it (e.g., converting it to uppercase or lowercase, extracting a substring, or concatenating it with another string). Since String is an immutable class, each of these operations creates a new String and the older one is discarded. This is what creates a lot of temporary garbage in the "heap". To resolve this problem, Java provides two classes: StringBuffer and StringBuilder. StringBuffer is the older class; StringBuilder is relatively new, having been added in JDK 5. Both are "mutable" versions of a string and are largely identical. (For the curious: there is a slight difference. StringBuffer's methods are synchronized, which makes it better suited to tasks where thread safety is important, at some cost in speed. StringBuilder is quicker but not thread-safe.)

In general, a StringBuilder or StringBuffer can be used wherever string content needs to be built up or modified (call toString() when an actual String is required). Both are more flexible than the String class in that you can add, insert, or append new contents to StringBuilder/StringBuffer objects, while the contents of String objects, as previously mentioned, are fixed once the string is created.
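
The difference shows up most clearly when a string is assembled piece by piece in a loop. The sketch below builds the same text both ways; the method names are hypothetical, and no timing claims are made here, only that the two approaches produce equal results while the String version creates a new object on every pass.

```java
class BuilderVsStringDemo {
    // Builds "0,1,2,...," with repeated String concatenation.
    // Each += creates a brand-new String and discards the old one.
    static String joinWithConcat(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += i + ",";
        }
        return result;
    }

    // Builds the same text by appending to one mutable StringBuilder.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(',');   // append() returns the same builder
        }
        return sb.toString();           // one String is created at the end
    }

    public static void main(String[] args) {
        System.out.println(joinWithConcat(5));    // prints 0,1,2,3,4,
        System.out.println(joinWithBuilder(5));   // prints 0,1,2,3,4,
    }
}
```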

StringBuilder Constructors

There are several constructors for StringBuilder objects:

java.lang.StringBuilder
+StringBuilder() Constructs an empty StringBuilder with capacity 16
+StringBuilder(capacity: int) Constructs a StringBuilder with the specified capacity
+StringBuilder(s: String) Constructs a StringBuilder with the specified string

Some Important StringBuilder Methods

One can find out more by examining the Java API, of course, but here are some of the more relevant StringBuilder methods:

java.lang.StringBuilder
+toString(): String Returns a String object from the StringBuilder
+capacity(): int Returns the capacity of this StringBuilder
+charAt(index: int): char Returns the char at the specified index
+length(): int Returns the number of characters in this StringBuilder
+setLength(newLength: int): void Sets a new length for this StringBuilder
+substring(startIndex: int): String Returns the substring starting at startIndex and continuing to the end of the StringBuilder content
+substring(startIndex: int, endIndex: int): String Returns the substring starting at startIndex to (endIndex-1)
+trimToSize(): void Reduces the storage size used for the StringBuilder
+append(str: String): StringBuilder Appends the specified string to this character sequence. (Note: this method has many overloaded forms where "str" is replaced by variables of various types, primitive and otherwise.)
+delete(start: int, end: int): StringBuilder Removes the characters in a substring of this sequence
+deleteCharAt(index: int): StringBuilder Removes the char at the specified position in this sequence
+insert(offset: int, str: String): StringBuilder Inserts the string into this character sequence at the indicated offset. This method is highly overloaded to allow for inserting other data types besides just Strings.
+replace(start: int, end: int, str: String): StringBuilder Replaces the characters in a substring of this sequence with characters in the specified String.
+setCharAt(index: int, ch: char): void The character at the specified index is set to ch.
+reverse(): StringBuilder Causes this character sequence to be replaced by the reverse of the sequence.
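
The following sketch walks one StringBuilder through several of the methods listed above, with the contents after each step shown in a comment. The class and method names (StringBuilderDemo, editDemo, isPalindrome) are made up for this example.

```java
class StringBuilderDemo {
    // Applies a series of in-place edits to a single StringBuilder.
    static String editDemo() {
        StringBuilder sb = new StringBuilder("Welcome");
        sb.append(" to Java");         // Welcome to Java
        sb.insert(11, "HTML and ");    // Welcome to HTML and Java
        sb.delete(8, 11);              // Welcome HTML and Java  (removes "to ")
        sb.replace(0, 7, "Hello");     // Hello HTML and Java
        return sb.toString();
    }

    // Hypothetical helper: true if text reads the same in both directions.
    static boolean isPalindrome(String text) {
        String reversed = new StringBuilder(text).reverse().toString();
        return reversed.equals(text);
    }

    public static void main(String[] args) {
        System.out.println(editDemo());                // prints Hello HTML and Java
        System.out.println(isPalindrome("racecar"));   // prints true
        System.out.println(isPalindrome("Java"));      // prints false
    }
}
```

As with substring(), the end index passed to delete() and replace() is exclusive.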

The File Class

The java.io.File class is used to obtain file properties and to delete and rename files. Essentially, it is a wrapper class for the file name and its directory path, intended to provide a level of abstraction to deal with most of the machine-dependent complexities of files and path names in a machine-independent fashion.

The following shows an example of using one of the constructors for the File class:

import java.io.File;

...

File f = new File("C:\\Users\\oser\\Desktop\\myfile.txt");
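
Creating a File object does not create or open anything on disk; it only wraps a path that can then be queried. The sketch below asks a few questions of a path that is assumed not to exist. The path and the helper name describe are hypothetical; exists(), getName(), isDirectory(), and getAbsolutePath() are standard File methods.

```java
import java.io.File;

class FilePropertiesDemo {
    // Hypothetical helper: reports the final name in the path and whether
    // anything currently exists at that path.
    static String describe(String path) {
        File f = new File(path);      // no file is created or opened here
        return f.getName() + " exists=" + f.exists();
    }

    public static void main(String[] args) {
        // Assumes no such directory exists on this machine
        System.out.println(describe("no_such_directory_12345/none.txt"));

        File here = new File(".");                    // the current directory
        System.out.println(here.isDirectory());       // prints true
        System.out.println(here.getAbsolutePath());   // machine-dependent output
    }
}
```

The delete() and renameTo() methods mentioned above return a boolean indicating success rather than throwing an exception on failure, so their return values are worth checking.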

It is perhaps counter-intuitive, but the File class is NOT for reading and writing file contents, and it contains no methods for doing so. Instead, file I/O (i.e., input/output) can be accomplished via the Scanner and PrintWriter classes, as described below:

java.util.Scanner
+Scanner(source: File) Creates a Scanner that produces values scanned from the specified file
+Scanner(source: String) Creates a Scanner that produces values scanned from the specified string
+close() Closes this Scanner
+hasNext(): boolean Returns true if this Scanner has another token in its input
+next(): String Returns next token as a String
+nextByte(): byte Returns next token as a byte
+nextShort(): short Returns next token as a short
+nextLong(): long Returns next token as a long
+nextFloat(): float Returns next token as a float
+nextDouble(): double Returns next token as a double
+useDelimiter(pattern: String): Scanner Sets this Scanner's delimiting pattern

As an example of using a Scanner on a String in combination with the useDelimiter() method, consider the following:

String s = "Cat-Dog-Mouse";
Scanner myScanner = new Scanner(s);
myScanner.useDelimiter("-");
System.out.println(myScanner.next());  //prints "Cat" on one line
System.out.println(myScanner.next());  //prints "Dog" on the next line
System.out.println(myScanner.next());  //prints "Mouse" on the last line

When creating a new Scanner to read the contents of a file, you will need to address what happens when the specified file can't be found. The typical way is to use a "try..catch", as shown below:

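//requires imports of java.io.File, java.io.FileNotFoundException, and java.util.Scanner

File myFile = new File("/Users/someuser/Desktop/somefile.txt");
//note, the above presumes an OS X or unix-based operating system
//on windows, the slashes are reversed (and must be escaped as "\\" in a Java string),
//and the path typically starts with "C:\\"

Scanner fileScanner = null;   //declared outside the try so it can be used (and closed) afterwards
try {
   fileScanner = new Scanner(myFile);
}
catch (FileNotFoundException e) {
   System.out.println("File was not found!");
   e.printStackTrace();
}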

Also, don't forget to call the close() method associated with your scanner when you are done using it.
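
One way to make sure close() is always called is Java's "try-with-resources" statement (available since Java 7), which closes the Scanner automatically when the block ends. This is offered as an alternative sketch rather than the approach used above; the class and helper names (ScannerFileDemo, countTokens) and the file path are hypothetical.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class ScannerFileDemo {
    // Hypothetical helper: returns the number of whitespace-separated tokens
    // in the file at path, or -1 if the file cannot be found.
    static int countTokens(String path) {
        // The Scanner declared here is closed automatically, even on error
        try (Scanner fileScanner = new Scanner(new File(path))) {
            int count = 0;
            while (fileScanner.hasNext()) {
                fileScanner.next();     // consume one token
                count++;
            }
            return count;
        } catch (FileNotFoundException e) {
            System.out.println("File was not found!");
            return -1;
        }
    }

    public static void main(String[] args) {
        // Assumes no such file exists, so the catch branch runs and -1 is printed
        System.out.println(countTokens("no_such_directory_12345/none.txt"));
    }
}
```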

When creating a new PrintWriter to write to a file, you will likewise need to address what happens when the file can't be found or accessed. A "try..catch" construction similar to the one used for Scanner above can be used. Also, as is the case with Scanners, PrintWriters should be closed when you are done using them, by calling their close() method (otherwise some of the output may never reach the file).

java.io.PrintWriter
+PrintWriter(filename: String) Creates a PrintWriter for the specified file
+close(): void Closes this PrintWriter
+print(s: String): void Writes a String
+print(c: char): void Writes a char
+print(cArray: char[]): void Writes an array of char
+print(i: int): void Writes an int value
+print(l: long): void Writes a long value
+print(f: float): void Writes a float value
+print(d: double): void Writes a double value
+print(b: boolean): void Writes a boolean value
Also contains the overloaded println() methods. A println method acts like a print method, except that it also prints a line separator. The line separator is defined by the system: it is "\r\n" on Windows systems and "\n" on Unix systems.
Also contains the overloaded printf() methods. Recall that printf allows more precise control over the format of certain data types when printed.
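
Putting the pieces together, the sketch below writes a few values to a file with a PrintWriter and then reads them back with a Scanner. It is only one possible arrangement: it uses a temporary file from File.createTempFile() so that it does not depend on any particular path, and the class and method names (WriteThenReadDemo, roundTrip) are invented for this example.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.util.Scanner;

class WriteThenReadDemo {
    // Writes two words and a number to a temporary file, reads them back,
    // and returns what was read (with the number incremented by one).
    static String roundTrip() {
        try {
            File temp = File.createTempFile("demo", ".txt");
            temp.deleteOnExit();                 // clean up when the JVM exits

            // PrintWriter(filename: String), closed automatically
            try (PrintWriter out = new PrintWriter(temp.getPath())) {
                out.print("Cat");
                out.print(' ');
                out.println("Dog");              // ends the first line
                out.println(42);                 // writes the int as text
            }

            try (Scanner in = new Scanner(temp)) {
                String first = in.next();        // "Cat"
                String second = in.next();       // "Dog"
                int number = in.nextInt();       // 42, parsed back into an int
                return first + " " + second + " " + (number + 1);
            }
        } catch (IOException e) {
            // FileNotFoundException is a kind of IOException
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());   // prints Cat Dog 43
    }
}
```

The writer is closed before the Scanner is opened, so everything written is guaranteed to be in the file by the time it is read.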