Just as you can read data from a local file on a computer or file server, you can read data from a file on the Web if you know the file’s URL (Uniform Resource Locator, the unique address for a file on the Web). For example, www.google.com/index.html is the URL for the file index.html located on the Google Web server. When you enter the URL in a Web browser, the Web server sends the data to your browser, which renders the data graphically. The figure below illustrates how this process works.
For an application program to read data from a URL, you first need to create a URL object using the java.net.URL class with this constructor:
public URL(String spec) throws MalformedURLException
For example, the following statement creates a URL object for http://www.google.com/index.html.
try {
  URL url = new URL("http://www.google.com/index.html");
}
catch (MalformedURLException ex) {
  ex.printStackTrace();
}
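A MalformedURLException is thrown if the URL string cannot be parsed, for example because it does not specify a protocol. MalformedURLException is a checked exception, so the constructor call must appear in a try-catch block or in a method that declares the exception. Note that the http:// prefix, including the two slashes (//) after the colon (:), is required for the URL class to recognize a valid URL; given a string such as “http:www.google.com/index.html”, the URL class does not treat www.google.com as the host name. It would be wrong to replace the statement in the try block with the following code, which omits the protocol and causes a MalformedURLException: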
URL url = new URL("www.google.com/index.html");
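The effect of a missing protocol can be checked with a minimal sketch like the one below. The class name UrlSyntaxCheck and the helper isParsable are hypothetical names chosen for illustration, not part of the original program. The sketch only parses the strings and never contacts a server. It uses the same URL(String) constructor as the text; recent JDK releases mark that constructor as deprecated, so a compiler warning may appear.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlSyntaxCheck {
    /** Returns true if the URL class can parse spec, false if it throws. */
    static boolean isParsable(String spec) {
        try {
            new URL(spec);   // parsing only; no network connection is opened
            return true;
        } catch (MalformedURLException ex) {
            return false;    // e.g. the string names no protocol
        }
    }

    public static void main(String[] args) {
        // Protocol present: parsed successfully
        System.out.println(isParsable("http://www.google.com/index.html")); // true
        // Protocol missing: MalformedURLException is caught
        System.out.println(isParsable("www.google.com/index.html"));        // false
    }
}
```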
After a URL object is created, you can use the openStream() method defined in the URL class to open an input stream and use this stream to create a Scanner object as follows:
Scanner input = new Scanner(url.openStream());
Now you can read the data from the input stream just as you would from a local file. The program below prompts the user to enter a URL and displays the size of the file.
The program prompts the user to enter a URL string (line 8) and creates a URL object (line 11). The constructor will throw a java.net.MalformedURLException (line 21) if the URL isn’t formed correctly.
The program creates a Scanner object from the input stream for the URL (line 13). If the URL is formed correctly but the file it refers to does not exist, an IOException will be thrown (line 24). For example, http://google.com/index1.html has a valid form, but no such file exists on the server, so an IOException would be thrown if this URL were used with this program.
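The steps described above can be sketched as follows. This is one possible version under stated assumptions, not the original listing, so its layout does not correspond to the line numbers cited above. The class name UrlFileSize, the helper countCharacters, and the messages printed are all hypothetical. The sketch assumes that "size" means the number of characters read line by line, excluding line terminators.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Scanner;

class UrlFileSize {
    /** Counts the characters at url, reading line by line (line terminators excluded). */
    static int countCharacters(URL url) throws IOException {
        int count = 0;
        // openStream() throws an IOException if the resource cannot be opened;
        // try-with-resources closes the Scanner and the underlying stream.
        try (Scanner input = new Scanner(url.openStream())) {
            while (input.hasNextLine()) {
                count += input.nextLine().length();
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.print("Enter a URL: ");
        String spec = new Scanner(System.in).nextLine();
        try {
            URL url = new URL(spec);  // may throw MalformedURLException
            System.out.println("The file size is " + countCharacters(url) + " characters");
        } catch (MalformedURLException ex) {
            // Caught first because MalformedURLException is a subclass of IOException
            System.out.println("Invalid URL");
        } catch (IOException ex) {
            System.out.println("I/O error: the file could not be read");
        }
    }
}
```

Because a URL can also use the file: protocol, countCharacters can be tried on a local file without a network connection, for example by passing the result of calling toUri().toURL() on a java.nio.file.Path.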