Python: BeautifulSoup: get an attribute value based on the name attribute

Question 1

I want to print an attribute value based on its name. Take for example

<META NAME="City" content="Austin">

I want to do something like this

soup = BeautifulSoup(f)  # f is some HTML containing the above meta tag
for meta_tag in soup('meta'):
    if meta_tag['name'] == 'City':
        print(meta_tag['content'])

The code above raises KeyError: 'name'. I believe this is because BeautifulSoup uses name itself, so it cannot be used as a keyword argument.

Question 2

It's pretty simple, use the following:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<META NAME="City" content="Austin">')
>>> soup.find("meta", {"name":"City"})
<meta name="City" content="Austin" />
>>> soup.find("meta", {"name":"City"})['content']
u'Austin'
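
Note that find returns None when no tag matches, so indexing its result directly fails on pages that lack the meta tag. A minimal sketch of a guarded lookup follows; the helper name get_meta_content is hypothetical, and html.parser is assumed as the parser.

```python
from bs4 import BeautifulSoup


def get_meta_content(html, name):
    """Return the content of the first <meta> whose name attribute equals `name`, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # Pass the attribute filter as a dict, because find's own `name`
    # parameter refers to the tag name, not to the HTML attribute.
    tag = soup.find("meta", attrs={"name": name})
    if tag is None:
        return None
    # .get returns None if the tag has no content attribute
    return tag.get("content")


print(get_meta_content('<META NAME="City" content="Austin">', "City"))
```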

Leave a comment if anything is unclear.

Question 3

theharshest answered the question, but here is another way to do the same thing. Also, in your example you have NAME in uppercase, while in your code you have name in lowercase.

from bs4 import BeautifulSoup

s = '<div class="question" id="get attrs" name="python" x="something">Hello World</div>'
soup = BeautifulSoup(s, 'html.parser')

attributes_dictionary = soup.find('div').attrs
print(attributes_dictionary)
# prints: {'class': ['question'], 'id': 'get attrs', 'name': 'python', 'x': 'something'}

print(attributes_dictionary['class'][0])
# prints: question

print(soup.find('div').get_text())
# prints: Hello World
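
On the uppercase NAME remark: as far as I can tell, the parser normalizes attribute names to lowercase, so the attribute is looked up as 'name' even though the markup says NAME. A small sketch, assuming html.parser as the parser:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<META NAME="City" content="Austin">', "html.parser")
meta = soup.find("meta")

# Attribute names come out lowercased; attribute values keep their case.
print(meta.attrs)
print("name" in meta.attrs, "NAME" in meta.attrs)
```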

Question 4

Six years late to the party, but I've been searching for how to extract the tag attribute value of an HTML element, so for:

<span property="addressLocality">Ayr</span>

I want "addressLocality". I kept being directed back here, but the answers didn't really solve my problem.

How I eventually managed to do it:

>>> from bs4 import BeautifulSoup as bs

>>> soup = bs('<span property="addressLocality">Ayr</span>', 'html.parser')
>>> my_attributes = soup.find().attrs
>>> my_attributes
{u'property': u'addressLocality'}

Since it's a dict, you can also use keys and values

>>> my_attributes.keys()
[u'property']
>>> my_attributes.values()
[u'addressLocality']
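
The session above is from Python 2. On Python 3, keys() and values() return view objects rather than lists, so wrap them in list() if you need a list, or iterate over items() directly. A short sketch under that assumption:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<span property="addressLocality">Ayr</span>', "html.parser")
span = soup.find("span")

# Walk over every attribute name/value pair of the tag
for attr_name, attr_value in span.attrs.items():
    print(attr_name, "=", attr_value)

# Materialize the dict views as plain lists
names = list(span.attrs.keys())
values = list(span.attrs.values())
```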

Hope it helps someone else!

Question 5

The following works:

from bs4 import BeautifulSoup

soup = BeautifulSoup('<META NAME="City" content="Austin">', 'html.parser')

metas = soup.find_all("meta")

for meta in metas:
    print(meta.attrs['content'], meta.attrs['name'])

Question 6

theharshest's answer is the best solution, but FYI, the problem you were encountering has to do with the fact that a Tag object in Beautiful Soup acts like a Python dictionary. If you access tag['name'] on a tag that doesn't have a 'name' attribute, you'll get a KeyError.
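
Because a Tag supports dictionary-style access, it also offers get, which returns None instead of raising when the attribute is missing. Here is a sketch of the original loop rewritten that way; the sample HTML, with one meta tag that has no name attribute, is made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical input: the first <meta> has no name attribute
html = '<meta charset="utf-8"><META NAME="City" content="Austin">'
soup = BeautifulSoup(html, "html.parser")

cities = []
for meta_tag in soup("meta"):
    # meta_tag["name"] would raise KeyError on the charset tag;
    # meta_tag.get("name") returns None there, so the comparison is just False.
    if meta_tag.get("name") == "City":
        cities.append(meta_tag["content"])

print(cities)
```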

Question 7

You can also try this solution:

To find the value that is written in a span inside the table

htmlContent

<table>
    <tr>
        <th>
            ID
        </th>
        <th>
            Name
        </th>
    </tr>


    <tr>
        <td>
            <span name="spanId" class="spanclass">ID123</span>
        </td>

        <td>
            <span>Bonny</span>
        </td>
    </tr>
</table>

Python code

from bs4 import BeautifulSoup

soup = BeautifulSoup(htmlContent, "lxml")

tables = soup.find_all("table")

for table in tables:
    storeValueRows = table.find_all("tr")
    # .string includes the surrounding whitespace, so strip it before comparing
    thValue = storeValueRows[0].find_all("th")[0].string.strip()

    if thValue == "ID":  # with this condition I verify that this is the table I wanted
        # storeValueRows[1] is the second <tr> of the table, find_all("span")[0]
        # is its first <span>, and .string is the text inside that span
        value = storeValueRows[1].find_all("span")[0].string
        value = value.strip()  # remove whitespace from the start and end of the string

        # find using an attribute:
        value = storeValueRows[1].find("span", {"name": "spanId"})['class']
        print(value)
        # this will print ['spanclass'], because class is a multi-valued attribute
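
As an alternative to walking the rows by index, a CSS attribute selector can target the span directly. This is a different technique from the one above, shown as a sketch on a shortened copy of the table, and it assumes a Beautiful Soup version that provides select_one.

```python
from bs4 import BeautifulSoup

# Shortened copy of the table above, kept inline so the snippet is self-contained
html = '<table><tr><td><span name="spanId" class="spanclass">ID123</span></td></tr></table>'
soup = BeautifulSoup(html, "html.parser")

# Select the first <span> whose name attribute equals "spanId"
span = soup.select_one('span[name="spanId"]')
if span is not None:
    print(span["class"])              # class is multi-valued, so this is a list
    print(span.get_text(strip=True))  # the text inside the span
```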

Question 8

If tdd is the tag <td class="abc"> 75</td>, then in BeautifulSoup:

if tdd.has_attr('class'):
    print(tdd.attrs['class'][0])

Result: abc
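
For completeness, a self-contained sketch of the same check, where tdd is first obtained by parsing the markup. The surrounding table and row tags are added here only so the cell sits in valid HTML.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<table><tr><td class="abc"> 75</td></tr></table>', "html.parser")
tdd = soup.find("td")

first_class = None
# has_attr guards against a KeyError when the tag has no class attribute
if tdd.has_attr("class"):
    first_class = tdd["class"][0]
    print(first_class)
```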