Question:
I’m making a request to a LinkedIn page and receiving an “HTTP/1.1 999 Request denied” response.
I get this response when the code runs on AWS EC2.
On localhost everything works fine.
This is a sample of my code to get the HTML of the page.
error_reporting(E_ALL);

$url = 'https://www.linkedin.com/pulse/5-essential-strategies-digital-michelle';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$response = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);

var_dump($response);
var_dump($info);
I don’t need the whole page content, just the meta tags (title, og tags).
Answer:
Note that status code 999 is not defined in the HTTP/1.1 specification, so it is most likely a custom code chosen by LinkedIn (it sounds like a joke).
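Because 999 is not a standard status code, a script can check for it explicitly instead of treating the response body as a normal page. A minimal sketch, assuming the `$info` array returned by `curl_getinfo()` as in the question; the function name `isRequestDenied` is hypothetical:

```php
<?php
// Hypothetical helper: reports whether a cURL transfer was rejected with
// the non-standard 999 status. $info is the array from curl_getinfo().
function isRequestDenied(array $info): bool
{
    // 'http_code' holds the last received HTTP status code.
    return isset($info['http_code']) && (int) $info['http_code'] === 999;
}
```

In the question's script this could be called as `isRequestDenied($info)` right after `curl_getinfo($ch)`, before trying to parse `$response`.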
LinkedIn doesn’t allow direct access. The probable reasons for blocking requests that come from other web servers are to:
- Prevent unauthorized copying of information
- Prevent intrusions
- Prevent abuse of requests
- Force use of the API
Some server IP addresses are blocked, while IPs from domestic ISPs are not. When you access LinkedIn with a web browser, you use the IP of your internet provider, which is why the same request works on localhost but is denied from EC2.
The only way to access the data is to use their APIs. See:
Note: Search engines like Google and Bing probably have their IPs on a whitelist.
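For the meta tags mentioned in the question, the extraction step itself does not depend on how the HTML is obtained. A minimal sketch, assuming PHP's DOM extension is available and that `$html` already holds a page you are permitted to fetch; the function name `extractMetaTags` is hypothetical:

```php
<?php
// Hypothetical helper: returns the <title> text and all Open Graph (og:*)
// meta tags found in an HTML string, keyed by tag name.
function extractMetaTags(string $html): array
{
    $tags = [];

    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The first <title> element, if the page has one.
    $title = $doc->getElementsByTagName('title')->item(0);
    if ($title !== null) {
        $tags['title'] = trim($title->textContent);
    }

    // Keep only <meta property="og:..."> entries.
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $property = $meta->getAttribute('property');
        if (strpos($property, 'og:') === 0) {
            $tags[$property] = $meta->getAttribute('content');
        }
    }

    return $tags;
}
```

Parsing with `DOMDocument` rather than regular expressions is a design choice here: it tolerates attribute order and quoting differences between pages.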