I am trying to download the HTML content of a web page, e.g. http://www.jobstreet.com.sg/, but I get a 416 status. I found a workaround that changes the status code to 200, but the downloaded content is still not correct. I think I am very close but missing something. Please help.
Code with 416 status:
public static void main(String[] args) {
    String URL = "http://www.jobstreet.com.sg/";
    HttpClient client = new org.apache.commons.httpclient.HttpClient();
    org.apache.commons.httpclient.methods.GetMethod method =
            new org.apache.commons.httpclient.methods.GetMethod(URL);
    client.getHttpConnectionManager().getParams().setConnectionTimeout(AppConfig.CONNECTION_TIMEOUT);
    client.getHttpConnectionManager().getParams().setSoTimeout(AppConfig.READ_DATA_TIMEOUT);
    String html = null;
    InputStream ios = null;
    try {
        int statusCode = client.executeMethod(method);
        ios = method.getResponseBodyAsStream();
        html = IOUtils.toString(ios, "utf-8");
        System.out.println(statusCode);
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (ios != null) {
            try { ios.close(); } catch (IOException e) { e.printStackTrace(); }
        }
        if (method != null) method.releaseConnection();
    }
    System.out.println(html);
}

Code with 200 status (but the HTML content is not correct):
public static void main(String[] args) {
    String URL = "http://www.jobstreet.com.sg/";
    HttpClient client = new org.apache.commons.httpclient.HttpClient();
    org.apache.commons.httpclient.methods.GetMethod method =
            new org.apache.commons.httpclient.methods.GetMethod(URL);
    client.getHttpConnectionManager().getParams().setConnectionTimeout(AppConfig.CONNECTION_TIMEOUT);
    client.getHttpConnectionManager().getParams().setSoTimeout(AppConfig.READ_DATA_TIMEOUT);
    String html = null;
    InputStream ios = null;
    try {
        int statusCode = client.executeMethod(method);
        // Retry with extra headers if the server answered 416.
        if (statusCode == HttpStatus.SC_REQUESTED_RANGE_NOT_SATISFIABLE) {
            method.setRequestHeader("User-Agent", "Mozilla/5.0");
            method.setRequestHeader("Accept-Ranges", "bytes=100-1500");
            statusCode = client.executeMethod(method);
        }
        ios = method.getResponseBodyAsStream();
        html = IOUtils.toString(ios, "utf-8");
        System.out.println(statusCode);
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (ios != null) {
            try { ios.close(); } catch (IOException e) { e.printStackTrace(); }
        }
        if (method != null) method.releaseConnection();
    }
    System.out.println(html);
}
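For comparison, here is a minimal sketch of a variant that may be worth trying. It rests on two assumptions that are not confirmed for this site: that the 416 is caused by the missing User-Agent rather than by anything range-related, and that the page may not be served as UTF-8. Note that Accept-Ranges is a response header in HTTP (the request header for byte ranges is Range), so the sketch sends neither and asks for the whole document on the first request. It uses only the JDK's HttpURLConnection instead of Commons HttpClient; the class name PageFetcher, the helper names, the header values and the timeouts are all made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class PageFetcher {

    /** Headers sent on the first request. The values are illustrative assumptions. */
    public static Map<String, String> browserLikeHeaders() {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("User-Agent", "Mozilla/5.0");
        headers.put("Accept", "text/html,application/xhtml+xml");
        // No Range header: the goal is the whole document, not a byte slice.
        return headers;
    }

    /** Picks the charset named in a Content-Type value, falling back to UTF-8. */
    public static Charset charsetFrom(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase().startsWith("charset=")) {
                    String name = p.substring("charset=".length()).replace("\"", "").trim();
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // unknown or malformed charset name, use the fallback
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    /** Fetches the page body as text and prints the status code. */
    public static String fetch(String address, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setConnectTimeout(connectTimeoutMs);
        conn.setReadTimeout(readTimeoutMs);
        for (Map.Entry<String, String> h : browserLikeHeaders().entrySet()) {
            conn.setRequestProperty(h.getKey(), h.getValue());
        }
        try {
            int status = conn.getResponseCode();
            System.out.println(status);
            // For 4xx/5xx responses the body is only available on the error stream.
            InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
            if (in == null) {
                return "";
            }
            try (InputStream body = in) {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = body.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                return new String(buf.toByteArray(), charsetFrom(conn.getContentType()));
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the URL as the first argument, e.g. http://www.jobstreet.com.sg/
        if (args.length > 0) {
            System.out.println(fetch(args[0], 10_000, 10_000));
        }
    }
}
```

If this still prints a 200 with unexpected content, comparing the printed status and the Content-Type header against what a browser receives for the same URL would help narrow down whether the difference comes from the headers or from the decoding.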