  1. AdMax
  2. ADMAX-2950

HTTPClient non-200 response code special case needed for W3C Cookie Policies

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Linking Report
    • Labels: None

      Description

      Yes, probably worth looking into. There appear to be configuration settings in HttpClient that allow the cookie policy to be more lenient.
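      As a minimal sketch of what such a setting could look like, assuming the page scanner uses Commons HttpClient 3.x (the package seen in the log output further down this thread): the class name LenientCookieFetch and the method fetchStatus are hypothetical and are not taken from the AdMax code base. CookiePolicy.BROWSER_COMPATIBILITY is the "compatibility specification" that the HttpClient 3.x cookie guide refers to.

```java
import java.io.IOException;

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.cookie.CookiePolicy;
import org.apache.commons.httpclient.methods.GetMethod;

// Hypothetical helper, for illustration only; not the actual PageScanner code.
public class LenientCookieFetch {

    // Performs a GET with the lenient "browser compatibility" cookie policy
    // instead of the default RFC 2109 policy, and returns the HTTP status code.
    public static int fetchStatus(String url) throws IOException {
        HttpClient client = new HttpClient();
        GetMethod get = new GetMethod(url);
        // Override the cookie policy for this request only.
        get.getParams().setCookiePolicy(CookiePolicy.BROWSER_COMPATIBILITY);
        try {
            return client.executeMethod(get);
        } finally {
            // Always return the connection to the connection manager.
            get.releaseConnection();
        }
    }
}
```

      The same policy could instead be set once on the client via client.getParams().setCookiePolicy(...), so that it applies to every request. The log in this thread shows the cookie rejection as a warning followed by a separate SSLHandshakeException, so this setting may only address the warning.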

      On 12/29/11 12:27 PM, Autumn Francesca wrote:

      Yikes! Ideally, should we add a special case for this? i.e. should I file a feature enhancement in JIRA? Otherwise the Link Scoring tool would be unusable for these scenarios.

      Thanks for looking into this.

      From: Patrick Wynne

      Sent: Thursday, December 29, 2011 12:23 PM

      To: Autumn Francesca

      Subject: Re: FW: Link Scoring Issue

      Most likely to do with this: http://hc.apache.org/httpclient-3.x/cookies.html

      RFC2109 is the first official cookie specification released by the W3C. Theoretically, all servers that handle version 1 cookies should use this specification and as such this specification is used by default within HttpClient.

      Unfortunately, many servers either incorrectly implement this standard or are still using the Netscape draft so occasionally this specification is too strict. If this is the case, you should switch to the compatibility specification as described below.

      RFC2109 is available at http://www.w3.org/Protocols/rfc2109/rfc2109.txt

      RFC2109 is the default cookie policy used by HttpClient.

      On 12/29/11 12:15 PM, Autumn Francesca wrote:

      But the header checker tool returns a 200 per Brandon. I didn't try it myself.

      And besides, what the heck is the error "domain must start with a dot"?

      From: Patrick Wynne

      Sent: Thursday, December 29, 2011 12:13 PM

      To: Autumn Francesca

      Subject: Re: FW: Link Scoring Issue

      I re-ran that report while monitoring the log output. Here is the output of processing one of those URLs:

      2011-12-29 08:57:35,135 INFO [com.thesearchagency.service.linkscore.worker.pagescan.PageScanner] (WorkManager(2)-19) scanPageSource BEGIN: 1037751::http://emedicine.medscape.com/pediatrics_general

      2011-12-29 08:57:35,135 INFO [com.thesearchagency.service.linkscore.worker.pagescan.PageScanner] (WorkManager(2)-19) Execute GET Method for http://emedicine.medscape.com/pediatrics_general

      2011-12-29 08:57:35,164 WARN [org.apache.commons.httpclient.HttpMethodBase] (WorkManager(2)-19) Cookie rejected: "$Version=0; NSC_mch_nfetdbqf.dpn=ffffffffaf12385645525d5f4f58455e445a4a423660; $Path=/; $Domain=medscape.com". Domain attribute "medscape.com" violates RFC 2109: domain must start with a dot

      2011-12-29 08:57:35,219 WARN [com.thesearchagency.service.linkscore.worker.pagescan.PageScanner] (WorkManager(2)-19) Page scanning exception: javax.net.ssl.SSLHandshakeException

      2011-12-29 08:57:35,219 INFO [com.thesearchagency.service.linkscore.worker.pagescan.PageScanner] (WorkManager(2)-19) scanPageSource FINISHED: 1037751::http://emedicine.medscape.com/pediatrics_general

      2011-12-29 08:57:35,220 INFO [com.thesearchagency.service.linkscore.worker.pagescan.PageScanner] (WorkManager(2)-19) LSR ID: 1576 - pageScanner [8027] - finished link target: [1037751 :: http://emedicine.medscape.com/pediatrics_general]

      Since the page scanner receives a non-200 response when performing an HTTP GET on these links, it records the error response and moves on.
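      The "domain must start with a dot" rejection in the log above can be illustrated with a simplified check. This is only a sketch of the two syntactic rules RFC 2109 places on an explicit Domain attribute (a leading dot and at least one embedded dot). It is not HttpClient's actual implementation, and the class and method names are hypothetical.

```java
// Simplified illustration of the RFC 2109 syntactic rules for an explicit
// Domain attribute. Hypothetical names; not HttpClient's implementation.
public class Rfc2109DomainCheck {

    // Returns null when the attribute passes both checks,
    // otherwise a short reason for the rejection.
    public static String check(String domainAttr) {
        // Rule 1: an explicitly specified domain must begin with a dot.
        if (!domainAttr.startsWith(".")) {
            return "domain must start with a dot";
        }
        // Rule 2: there must be a dot somewhere after the leading one,
        // and it must not be the last character (so ".com" is rejected).
        int embedded = domainAttr.indexOf('.', 1);
        if (embedded < 0 || embedded == domainAttr.length() - 1) {
            return "domain must contain an embedded dot";
        }
        return null;
    }

    public static void main(String[] args) {
        // The value sent by the server in the log above.
        System.out.println("medscape.com  -> " + check("medscape.com"));
        // The form a strict RFC 2109 check would accept.
        System.out.println(".medscape.com -> " + check(".medscape.com"));
    }
}
```

      Under this check, the Domain value "medscape.com" from the log fails the first rule, while ".medscape.com" would pass both.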

      Patrick

      On 12/29/11 11:32 AM, Autumn Francesca wrote:

      I just tried rerunning this again a few minutes ago, hoping it was a timing thing.... But same deal. I also checked the URLs which are live. Hmmmmm

      HTTP status code error: javax.net.ssl.SSLHandShakeException

      From: Jeff Shih

      Sent: Wednesday, December 28, 2011 5:31 PM

      To: Autumn Francesca; Jeffrey Collemer

      Subject: Link Scoring Issue

      Hey Autumn,

      I usually send these to Patrick to help me investigate. I know you said not to send him any ADR issues directly. How about Link Scoring?

      Jeffrey Shih | Technical Account Manager

      From: [Sent By Eventum] Brandon Schakola software.support@thesearchagency.com

      Sent: Wednesday, December 28, 2011 12:00 PM

      To: Jeff Shih

      Subject: 9068 http___pngal650-dc1bos0002.dc1bos.thesearchagency.com_8080_users_540_linkscore_reports_linkscorereport_1573_2011-12-27_16_52_47.0-1.csv

      Hi,

      I'm having an issue with the Link Scoring Tool while trying to run it on some Medscape.com URLs. You can check report IDs 1571, 1572, and 1573 for additional attempts.

      I'm receiving an HTTP status code error: javax.net.ssl.SSLHandShakeException

      I have checked the headers of the URLs through the header checker tool, but they respond with a Server 200 code.

      Please advise,

      Thank You,

      Brandon

      Brandon Schakola | Account Manager SEO

            People

            • Assignee: Unassigned
            • Reporter: autumn Autumn (Inactive)