Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Linking Report
Labels: None
Description
Yes, probably worth looking into. HttpClient appears to have configurable settings that allow the cookie policy to be more lenient.
On 12/29/11 12:27 PM, Autumn Francesca wrote:
Yikes! Ideally, should we add a special case for this? i.e. should I file a feature enhancement in JIRA? Otherwise the Link Scoring tool would be unusable for these scenarios.
Thanks for looking into this.
From: Patrick Wynne
Sent: Thursday, December 29, 2011 12:23 PM
To: Autumn Francesca
Subject: Re: FW: Link Scoring Issue
Most likely to do with this: http://hc.apache.org/httpclient-3.x/cookies.html
RFC2109 is the first official cookie specification released by the W3C. Theoretically, all servers that handle version 1 cookies should use this specification and as such this specification is used by default within HttpClient.
Unfortunately, many servers either incorrectly implement this standard or are still using the Netscape draft so occasionally this specification is too strict. If this is the case, you should switch to the compatibility specification as described below.
RFC2109 is available at http://www.w3.org/Protocols/rfc2109/rfc2109.txt
RFC2109 is the default cookie policy used by HttpClient.
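To illustrate the strictness described above, here is a minimal, standard-library-only sketch of the "domain must start with a dot" rule next to a more lenient check. It does not use HttpClient's own classes; the class and method names (CookieDomainCheck, acceptsStrict, acceptsLenient) are hypothetical, and the rules are a simplified reading of RFC 2109 section 4.3.2, not a full implementation of either policy.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of HttpClient.
final class CookieDomainCheck {

    // Strict check, loosely modelled on the RFC 2109 rejection rules.
    static boolean acceptsStrict(String domainAttr, String requestHost) {
        String domain = domainAttr.toLowerCase(Locale.ROOT);
        String host = requestHost.toLowerCase(Locale.ROOT);
        // Assumption of this sketch: a Domain identical to the host is accepted.
        if (domain.equals(host)) {
            return true;
        }
        // The rule from the log message: an explicit Domain must start with a dot.
        if (!domain.startsWith(".")) {
            return false;
        }
        // The Domain must contain an embedded dot (".com" alone is rejected).
        int embeddedDot = domain.indexOf('.', 1);
        if (embeddedDot < 0 || embeddedDot == domain.length() - 1) {
            return false;
        }
        // The host must end with the Domain ...
        if (!host.endsWith(domain)) {
            return false;
        }
        // ... and the part of the host in front of the Domain must not contain dots.
        String prefix = host.substring(0, host.length() - domain.length());
        return prefix.indexOf('.') < 0;
    }

    // Lenient check: ignore a missing leading dot and only require a suffix match.
    // Real browser-style policies apply further checks that are omitted here.
    static boolean acceptsLenient(String domainAttr, String requestHost) {
        String domain = domainAttr.toLowerCase(Locale.ROOT);
        String host = requestHost.toLowerCase(Locale.ROOT);
        String bare = domain.startsWith(".") ? domain.substring(1) : domain;
        return !bare.isEmpty() && (host.equals(bare) || host.endsWith("." + bare));
    }

    public static void main(String[] args) {
        String host = "emedicine.medscape.com";
        // Domain value taken from the rejected cookie in the log output.
        System.out.println("strict,  medscape.com:  " + acceptsStrict("medscape.com", host));
        System.out.println("strict,  .medscape.com: " + acceptsStrict(".medscape.com", host));
        System.out.println("lenient, medscape.com:  " + acceptsLenient("medscape.com", host));
    }
}
```

With the Domain value from the log ("medscape.com", no leading dot), the strict check rejects the cookie while the lenient one accepts it, which is the gap the compatibility specification mentioned above is meant to cover.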
On 12/29/11 12:15 PM, Autumn Francesca wrote:
But the header checker tool returns a 200 per Brandon. I didn't try it myself.
And besides, what the heck is the error "domain must start with a dot"?
From: Patrick Wynne
Sent: Thursday, December 29, 2011 12:13 PM
To: Autumn Francesca
Subject: Re: FW: Link Scoring Issue
I re-ran that report while monitoring the log output. Here is the output of processing one of those URLs:
2011-12-29 08:57:35,135 INFO [com.thesearchagency.service.linkscore.worker.pagescan.PageScanner] (WorkManager(2)-19) scanPageSource BEGIN: 1037751::http://emedicine.medscape.com/pediatrics_general
2011-12-29 08:57:35,135 INFO [com.thesearchagency.service.linkscore.worker.pagescan.PageScanner] (WorkManager(2)-19) Execute GET Method for http://emedicine.medscape.com/pediatrics_general
2011-12-29 08:57:35,164 WARN [org.apache.commons.httpclient.HttpMethodBase] (WorkManager(2)-19) Cookie rejected: "$Version=0; NSC_mch_nfetdbqf.dpn=ffffffffaf12385645525d5f4f58455e445a4a423660; $Path=/; $Domain=medscape.com". Domain attribute "medscape.com" violates RFC 2109: domain must start with a dot
2011-12-29 08:57:35,219 WARN [com.thesearchagency.service.linkscore.worker.pagescan.PageScanner] (WorkManager(2)-19) Page scanning exception: javax.net.ssl.SSLHandshakeException
2011-12-29 08:57:35,219 INFO [com.thesearchagency.service.linkscore.worker.pagescan.PageScanner] (WorkManager(2)-19) scanPageSource FINISHED: 1037751::http://emedicine.medscape.com/pediatrics_general
2011-12-29 08:57:35,220 INFO [com.thesearchagency.service.linkscore.worker.pagescan.PageScanner] (WorkManager(2)-19) LSR ID: 1576 - pageScanner [8027] - finished link target: [1037751 :: http://emedicine.medscape.com/pediatrics_general]
Since the page scanner receives a non-200 response when performing an HTTP GET on these links, it records the error response and moves on.
Patrick
On 12/29/11 11:32 AM, Autumn Francesca wrote:
I just tried rerunning this a few minutes ago, hoping it was a timing thing... but same deal. I also checked the URLs, which are live. Hmmmmm
HTTP status code error: javax.net.ssl.SSLHandshakeException
From: Jeff Shih
Sent: Wednesday, December 28, 2011 5:31 PM
To: Autumn Francesca; Jeffrey Collemer
Subject: Link Scoring Issue
Hey Autumn,
I usually send these to Patrick to help me investigate. I know you said not to send him any ADR issues directly. How about Link Scoring?
Jeffrey Shih | Technical Account Manager
The Search Agency, Inc.
From: [Sent By Eventum] Brandon Schakola software.support@thesearchagency.com
Sent: Wednesday, December 28, 2011 12:00 PM
To: Jeff Shih
Subject: 9068 http___pngal650-dc1bos0002.dc1bos.thesearchagency.com_8080_users_540_linkscore_reports_linkscorereport_1573_2011-12-27_16_52_47.0-1.csv
Hi,
I'm having an issue with the Link Scoring Tool while trying to run it on some Medscape.com URLs. You can check report IDs 1571, 1572, and 1573 for additional attempts.
I'm receiving an HTTP status code error: javax.net.ssl.SSLHandshakeException
I have checked the headers of the URLs through the header checker tool, but they respond with a 200 status code.
Please advise,
Thank You,
Brandon
Brandon Schakola | Account Manager SEO