How to Extract Title from Web Pages in Java

In this example, we will show a simple program that extracts the title from a web page in Java. The example is built with the Java Jsoup API, which is used to process HTML documents from a URL or another source. The example program has been tested, and its output is shared in this post.

Example Program (WebUtils.java)

package com.dineshkrish;

import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

/**
 * 
 * @author Dinesh Krishnan
 *
 */

public class WebUtils {

	// method to extract title from url
	public String getTitle(final String link) {

		String title = null;

		try {

			// creating URL object
			URL url = new URL(link);

			// fetching and parsing the HTML document from the URL (5000 ms timeout)
			Document document = Jsoup.parse(url, 5000);

			// extracting the title from given url
			title = document.title();
			

		} catch (MalformedURLException e) {

			System.out.println(e.getMessage());
			e.printStackTrace();
		} catch (IOException e) {

			System.out.println(e.getMessage());
			e.printStackTrace();
		}

		return title;
	}

	public static void main(String[] args) {

		// input URL; change it as needed
		String link = "http://www.google.com";

		WebUtils utils = new WebUtils();

		// printing the extracted title
		System.out.println(utils.getTitle(link));
	}
}
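
Note that new URL(link) throws a MalformedURLException when the link has no protocol (for example "www.google.com"), in which case getTitle returns null. It can help to check the input before calling getTitle. The following is only a minimal sketch of such a check; the LinkChecker class and its isHttpLink method are hypothetical names and are not part of the original program.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of the original WebUtils class
class LinkChecker {

	// returns true only for absolute http/https links that have a host
	static boolean isHttpLink(final String link) {

		if (link == null) {
			return false;
		}

		try {

			// parsing the link without opening any connection
			URI uri = new URI(link.trim());
			String scheme = uri.getScheme();

			boolean httpScheme = "http".equalsIgnoreCase(scheme)
					|| "https".equalsIgnoreCase(scheme);

			return httpScheme && uri.getHost() != null;

		} catch (URISyntaxException e) {

			// the link is not a syntactically valid URI
			return false;
		}
	}

	public static void main(String[] args) {

		System.out.println(isHttpLink("http://www.google.com"));
		System.out.println(isHttpLink("www.google.com"));
	}
}
```

A caller could run this check first and pass the link to getTitle only when it returns true.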

Maven Dependency (pom.xml)

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>
	<groupId>com.dineshkrish</groupId>
	<artifactId>JsoupExample</artifactId>
	<version>0.0.1-SNAPSHOT</version>

	<dependencies>

		<dependency>
			<groupId>org.jsoup</groupId>
			<artifactId>jsoup</artifactId>
			<version>1.9.2</version>
		</dependency>

	</dependencies>

</project>

Output

Google
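
If the page cannot be fetched, getTitle returns null and the program prints "null". A page without a title element gives an empty string. One possible way to handle both cases is to fall back to a default text. This is a small sketch only; the TitleFallback class, the titleOrDefault method and the "(no title)" text are assumptions made for illustration.

```java
import java.util.Optional;

// Hypothetical helper, not part of the original WebUtils class
class TitleFallback {

	// returns the trimmed title, or the fallback when the title is null or blank
	static String titleOrDefault(final String title, final String fallback) {

		return Optional.ofNullable(title)
				.map(String::trim)
				.filter(t -> !t.isEmpty())
				.orElse(fallback);
	}

	public static void main(String[] args) {

		System.out.println(titleOrDefault("Google", "(no title)"));
		System.out.println(titleOrDefault(null, "(no title)"));
	}
}
```

In the main method of WebUtils, the result of utils.getTitle(link) could be passed through such a helper before printing.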
