Python urllib2: Receive JSON response from URL


90

I am trying to GET a URL using Python, and the response is JSON. However, when I run

import urllib2
response = urllib2.urlopen('https://api.instagram.com/v1/tags/pizza/media/XXXXXX')
html=response.read()
print html

html is of type str, and I am expecting JSON. Is there any way I can capture the response as JSON or a Python dictionary instead of a str?


1
Is response.read() returning a valid JSON string?
Martijn Pieters

Yes, it's a valid JSON string; it's just of type str and not dict.
Deepak B

If it's a JSON representation of a string, rather than a JSON representation of an object (dict), you can't force the server to return different data; you probably need to make a different request. If it's just that you don't know how to parse a JSON representation into the equivalent Python object, Martijn Pieters' answer is correct.
abarnert

Answers:


183

If the URL is returning valid JSON-encoded data, use the json library to decode that:

import urllib2
import json

response = urllib2.urlopen('https://api.instagram.com/v1/tags/pizza/media/XXXXXX')
data = json.load(response)   
print data

1
@ManuelSchneid3r: The answer here is for Python 2, where reading from response gives you bytestrings, and json.load() expects to read a bytestring. JSON must be encoded using a UTF codec, and the above works for UTF-8, UTF-16 and UTF-32, provided a BOM codepoint is included for the latter two codecs. The answer you link to presumes UTF-8 was used, which is usually correct because that's the default. As of Python 3.6, the json library auto-decodes bytestrings containing JSON data, provided a UTF encoding is used.
Martijn Pieters
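
To illustrate the Python 3.6+ behavior described in the comment above, here is a minimal sketch. The payload dictionary is made up for the example, and the bytes are produced locally instead of being read from a real response.

```python
import json

# Hypothetical payload, used only for illustration.
payload = {"tag": "pizza", "count": 3}
text = json.dumps(payload)

decoded = {}
for codec in ("utf-8", "utf-16", "utf-32"):
    raw = text.encode(codec)  # bytes; "utf-16" and "utf-32" prepend a BOM
    # On Python 3.6+, json.loads() accepts bytes and detects the UTF codec.
    decoded[codec] = json.loads(raw)
```

On Python 3 versions before 3.6, json.loads() rejects bytes, so the explicit .decode() step shown in other answers is still needed there.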

@ManuelSchneid3r: I'd otherwise recommend you use the requests library, which also automatically detects the correct UTF codec to use in cases where the BOM is missing and no character set was specified in the response header. Just use the response.json() method.
Martijn Pieters

35
import json
import urllib.request

url = 'http://example.com/file.json'
r = urllib.request.urlopen(url)
data = json.loads(r.read().decode(r.info().get_param('charset') or 'utf-8'))
print(data)

urllib, for Python 3.4
HTTPMessage, returned by r.info()
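
The charset fallback used in this answer can be sketched as a small helper. The names decode_json_body and fetch_json are hypothetical, and parsing the header with email.message.Message is one possible approach, not necessarily what the answer's author had in mind.

```python
import json
import urllib.request
from email.message import Message


def decode_json_body(body, content_type, default_charset="utf-8"):
    """Decode raw response bytes as JSON (hypothetical helper).

    The charset comes from the Content-Type header value when present,
    otherwise default_charset is used.
    """
    header = Message()
    header["Content-Type"] = content_type
    charset = header.get_content_charset() or default_charset
    return json.loads(body.decode(charset))


def fetch_json(url):
    """Fetch url with urllib and decode the body (hypothetical wrapper)."""
    with urllib.request.urlopen(url) as response:
        content_type = response.headers.get("Content-Type") or "application/json"
        return decode_json_body(response.read(), content_type)
```

For example, decode_json_body(b'{"a": 1}', "application/json") falls back to UTF-8 because the header carries no charset parameter.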


1
Solid code other than print data being incorrect for Python 3. Should be print(data).
David Metcalfe

1
Yes and line 2 should be import urllib.request . Also, that .json file in the url no longer exists.
hack-tramp

5
"""
Return JSON to webpage
Adding to wonderful answer by @Sanal
For Django 3.4
Adding a working url that returns a json (Source: http://www.jsontest.com/#echo)
"""

import json
import urllib.request

from django.http import HttpResponse

url = 'http://echo.jsontest.com/insert-key-here/insert-value-here/key/value'

def echo_view(request):  # hypothetical view name; the original snippet showed only the body
    response = urllib.request.urlopen(url)
    data = json.loads(response.read().decode(response.info().get_param('charset') or 'utf-8'))
    return HttpResponse(json.dumps(data), content_type="application/json")

1
whew, that json.dumps() saved my day.
Lloyd

For Django 1.7+, you could use JsonResponse directly: from django.http import JsonResponse, and then return JsonResponse({'key':'value'})
raccoon

1
I was doing json.dump() instead of json.dumps(), feeling dumb, Thanks for the save!
Hashir Baig
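
The difference between the two functions mentioned in these comments can be shown with a short sketch; the data dictionary is made up for the example.

```python
import io
import json

data = {"key": "value"}

# json.dumps() returns the serialized JSON as a str.
as_text = json.dumps(data)

# json.dump() writes to a file-like object and returns None.
buffer = io.StringIO()
result = json.dump(data, buffer)
```

So json.dumps() is the one to use when the string itself is needed, for instance as the body of an HttpResponse.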

4

Be careful about validation and so on, but the straightforward solution is this:

import json
# response is the file-like object returned by urllib2.urlopen(...)
the_dict = json.load(response)
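
As a rough sketch of what such validation could look like (the helper name parse_json_object is hypothetical, and which checks are appropriate depends on the API being called):

```python
import json


def parse_json_object(text):
    """Parse text as JSON and require a top-level object (hypothetical helper)."""
    try:
        parsed = json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
        raise ValueError("response is not valid JSON: %s" % exc)
    if not isinstance(parsed, dict):
        # Valid JSON can also be a list, string, number, and so on.
        raise ValueError("expected a JSON object, got %s" % type(parsed).__name__)
    return parsed
```

This covers both cases raised in the comments on the question: a body that is not JSON at all, and a body that is valid JSON but not an object.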

2
import json
import urllib2
resource_url = 'http://localhost:8080/service/'
response = json.loads(urllib2.urlopen(resource_url).read())

1

Python 3 standard library one-liner:

load(urlopen(url))

# imports (place these above the code before running it)
from json import load
from urllib.request import urlopen
url = 'https://jsonplaceholder.typicode.com/todos/1'

0

Though I guess this has already been answered, I would like to add my bit:

import json
import urllib2

class Website(object):
    def __init__(self, name):
        self.name = name

    def dump(self):
        # Return the file-like response object for the URL.
        self.data = urllib2.urlopen(self.name)
        return self.data

    def convJSON(self):
        # json.load() reads from the file-like object returned by dump().
        data = json.load(self.dump())
        print data

domain = Website("https://example.com")
domain.convJSON()

Note: the object passed to json.load() must support .read(), so passing urllib2.urlopen(self.name).read(), which is a str, would not work. The domain passed in must include the protocol (http or https).
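
The distinction in the note can be sketched in Python 3 without any network access. Here io.StringIO is only a stand-in for the response object returned by urlopen(), and the JSON text is made up.

```python
import io
import json

text = '{"name": "example"}'

# json.loads() takes the JSON text itself.
from_string = json.loads(text)

# json.load() takes an object with a .read() method; io.StringIO stands in
# here for the response object returned by urlopen().
from_file_like = json.load(io.StringIO(text))
```

Passing the already-read string to json.load() instead fails, because a str has no .read() method.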


0

You can also get JSON by using requests, as below:

import requests

r = requests.get('http://yoursite.com/your-json-pfile.json')
json_response = r.json()

0

This is another, simpler solution to your question:

pd.read_json(json_data)

where pd is pandas (import pandas as pd) and json_data is the str output from the following code:

from urllib.request import urlopen

response = urlopen("https://data.nasa.gov/resource/y77d-th95.json")
json_data = response.read().decode('utf-8', 'replace')

-1

None of the examples provided here worked for me. They were either for Python 2 (urllib2), or, for Python 3, they returned the error "ImportError: No module named request". I googled the error message, and it apparently requires me to install a module, which is obviously unacceptable for such a simple task.

This code worked for me:

import json,urllib
data = urllib.urlopen("https://api.github.com/users?since=0").read()
d = json.loads(data)
print (d)

2
You are evidently using Python 2. In Python 3, there is no urllib.urlopen; urlopen is in the urllib.request module.
Nick Matteo
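
One common way to write code that runs on both versions is a guarded import. This is a minimal sketch: only the import is shown, and the rest of the program would call urlopen() as in the answers above.

```python
try:
    # Python 3: urlopen lives in the urllib.request module.
    from urllib.request import urlopen
except ImportError:
    # Python 2: fall back to urllib2, which provides urlopen directly.
    from urllib2 import urlopen
```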
Licensed under cc by-sa 3.0 with attribution required.