  1. #1
    JavaPF

    How to Grab the HTML source code of a website URL index page?

    This code will grab the HTML source from a given URL.

    Change "website here.com" to a real URL starting with http:// and the program will print the index page's source code to the console.

    The nice thing about this code is that it sets a browser-style User-Agent header, so the request looks like it comes from a web browser.
    This lets you fetch pages from sites like Google that normally block connections from non-browser applications.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLConnection;

    public class GrabHTML {

     public static void connect() throws Exception {

      //Set URL
      URL url = new URL("http://website here.com");
      URLConnection spoof = url.openConnection();

      //Spoof the connection so we look like a web browser
      spoof.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; H010818)");

      //try-with-resources closes the reader (and the underlying stream) when done
      try (BufferedReader in = new BufferedReader(new InputStreamReader(spoof.getInputStream()))) {
       String strLine;

       //Loop through every line in the source
       while ((strLine = in.readLine()) != null) {

        //Prints each line to the console
        System.out.println(strLine);
       }
      }

      System.out.println("End of page.");
     }

     public static void main(String[] args) {

      try {
       //Calling the connect method
       connect();
      } catch (Exception e) {
       //Report the failure instead of silently ignoring it
       e.printStackTrace();
      }
     }
    }
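
    If you want to check what the User-Agent header actually does without hitting a real site, here is a rough sketch that starts a throwaway local server and has it echo back whatever User-Agent it receives. The class name UserAgentEcho and the helpers fetch and startEchoServer are made up for this example, and it assumes a JDK that ships the com.sun.net.httpserver package (standard Oracle/OpenJDK builds do).

```java
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class UserAgentEcho {

    // Hypothetical helper: fetches a page while sending the given User-Agent header.
    public static String fetch(String address, String userAgent) throws IOException {
        URLConnection connection = URI.create(address).toURL().openConnection();
        connection.setRequestProperty("User-Agent", userAgent);

        StringBuilder page = new StringBuilder();
        // Assumes the response is UTF-8; a real page may declare another charset.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    // Hypothetical helper: local server that replies with the User-Agent it was sent.
    public static HttpServer startEchoServer() throws IOException {
        // Port 0 asks the operating system for any free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String agent = exchange.getRequestHeaders().getFirst("User-Agent");
            byte[] body = String.valueOf(agent).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startEchoServer();
        try {
            String address = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
            // The server should send back the same string we put in the header.
            System.out.print(fetch(address,
                    "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; H010818)"));
        } finally {
            server.stop(0);
        }
    }
}
```

    Running it should print the same Mozilla/4.0 string that was sent, which shows the header is the only thing the server sees of the "browser". Returning the page as a String instead of printing inside the method is just a choice that makes the helper easier to reuse.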
