Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Parse JSON string from Wikimedia using jQuery

I'm trying to get the infobox from wiki pages. For this I'm using the wiki API. The following is the URL from which I'm getting the JSON data.

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=json&titles="+first+"&rvsection=0

Where first is a variable containing the article title for Wikipedia.

I'm finding it extremely complex to parse this data and build meaningful HTML out of it.

I was using the $.each function initially, but the data is nested so deeply that I had to use it 6-7 times to get to the data I actually want. I think there must be a better alternative than this. Please help me.

JSON data for reference:

jQuery16209061950308827726_1334683337112({"query":{"pages":{"11039790":{"pageid":11039790,"ns":0,"title":"Animal","revisions":[{"*":"{{Redirect|Animalia}}
{{Other uses}}
{{pp-semi-protected|small=yes}}
{{pp-move-indef}}
{{Taxobox
| color = {{taxobox color|[[animalia]]}}
| name = Animals
| fossil_range = [[Ediacaran]] \u2013 Recent {{fossilrange|610|0|}}
| image = Animal diversity.png
| image_width = 250px
| domain = [[Eukaryota]]
{{Taxobox_norank_entry | taxon = [[Opisthokonta]]}}
{{Taxobox_norank_entry | taxon = [[Holozoa]]}}
{{Taxobox_norank_entry | taxon = [[Filozoa]]}}
| regnum = '''Animalia'''
| regnum_authority = [[Carolus Linnaeus|Linnaeus]], [[Systema Naturae|1758]]
| subdivision_ranks = [[Phylum|Phyla]]
| subdivision =
* '''Subkingdom [[Parazoa]]'''
** [[Sponge|Porifera]]
** [[Placozoa]]
* '''Subkingdom [[Eumetazoa]]'''
** '''[[Radiata]] (unranked)'''
*** [[Ctenophora]]
*** [[Cnidaria]]
** '''[[Bilateria]] (unranked)'''
*** [[Orthonectida]]
*** [[Rhombozoa]]
*** [[Acoelomorpha]]
*** [[Chaetognatha]]
*** '''Superphylum [[Deuterostomia]]'''
**** [[Chordata]]
**** [[Hemichordata]]
**** [[Echinoderm]]ata
**** [[Xenoturbellida]]
**** [[Vetulicolia]] [[extinction|\u2020]]
*** '''[[Protostomia]] (unranked)'''
**** '''Superphylum [[Ecdysozoa]]'''
***** [[Kinorhyncha]]
***** [[Loricifera]]
***** [[Priapulida]]
***** [[Nematoda]]
***** [[Nematomorpha]]
***** [[Lobopodia]]
***** [[Onychophora]]
***** [[Tardigrada]]
***** [[Arthropoda]]
**** '''Superphylum [[Platyzoa]]'''
***** [[Platyhelminthes]]
***** [[Gastrotricha]]
***** [[Rotifera]]
***** [[Acanthocephala]]
***** [[Gnathostomulida]]
***** [[Micrognathozoa]]
***** [[Cycliophora]]
**** '''Superphylum [[Lophotrochozoa]]'''
***** [[Sipuncula]]
***** [[Hyolitha]] [[extinction|\u2020]]
***** [[Nemertea]]
***** [[Phoronida]]
***** [[Bryozoa]]
***** [[Entoprocta]]
***** [[Brachiopoda]]
***** [[Mollusca]]
***** [[Annelida]]
***** [[Echiura]]
}}

'''Animals''' are a major group of multicellular, [[eukaryotic]] [[organism]]s of the [[Kingdom (biology)|kingdom]] '''Animalia''' or '''Metazoa'''. Their [[body plan]] eventually becomes fixed as they [[Developmental biology|develop]], although some undergo a process of [[metamorphosis]] later on in their life. Most animals are [[Motility|motile]], meaning they can move spontaneously and independently. All animals are also [[heterotroph]]s, meaning they must ingest other organisms or their products for [[sustenance]].

Most known animal [[phylum|phyla]] appeared in the fossil record as marine species during the [[Cambrian explosion]], about 542 million years ago."}]}}}})
1 Answer

If you want the actual HTML as it is displayed on the wiki page, use action=parse instead. And yes, the result objects are deeply nested, but there is no reason to loop over them!

  • the first property is always the action, here: query
  • you requested properties of pages, so you receive a pages object
  • its entries are keyed by page id; this is the only step that needs a loop
  • each page object has certain properties (like a title); you are interested in revisions
  • this is an array of revision objects; you need the first (and only) one
  • the source text of a revision object is in its "*" property

So, just do it:

var content = null;
if (data && data.query && data.query.pages) {
    var pages = data.query.pages;
    for (var id in pages) { // in your case a loop over a single property
        var revisions = pages[id].revisions;
        if (revisions && revisions[0] && revisions[0]["*"]) {
            content = revisions[0]["*"];
        } else {
            // error: no revision content returned for whatever reason
        }
    }
} else {
    // error: no pages returned / other problems
}
// use the "content" variable here (it is still null if nothing was found)
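
For reuse, the same traversal can be wrapped in a small function. This is only a minimal sketch: the name getFirstRevisionContent and the trimmed-down sample object are made up for illustration, and the sample merely imitates the shape of the response shown in the question.

```javascript
// Hypothetical helper (not part of jQuery or the wiki API): returns the
// source text of the first revision of the first page that has one,
// or null if any level of the response is missing.
function getFirstRevisionContent(data) {
    if (!data || !data.query || !data.query.pages) {
        return null; // no pages object at all
    }
    var pages = data.query.pages;
    for (var id in pages) {
        if (!Object.prototype.hasOwnProperty.call(pages, id)) {
            continue; // skip inherited properties
        }
        var revisions = pages[id].revisions;
        if (revisions && revisions[0] && typeof revisions[0]["*"] === "string") {
            return revisions[0]["*"];
        }
    }
    return null; // pages were returned, but none carried revision content
}

// Illustrative sample shaped like the response in the question.
var sample = {
    query: { pages: { "11039790": {
        pageid: 11039790, ns: 0, title: "Animal",
        revisions: [{ "*": "{{Taxobox | name = Animals }}" }]
    } } }
};

console.log(getFirstRevisionContent(sample));
console.log(getFirstRevisionContent({ query: { pages: { "-1": { title: "Nope", missing: "" } } } }));
```

Returning null for every failure keeps the calling code down to a single check; if you need to tell the failure cases apart, return an error description instead.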

Don't forget to check for the existence of each object! If you requested no pages, there will be no pages object at all; that is the only case in which the pages "array" would be empty. A page may be missing, have an invalid title, or have some other problem, so that it has no revisions, and so on.
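
If the rendered HTML is what you are after, the action=parse route mentioned at the top could look roughly like the sketch below. The function names buildParseUrl and getParsedHtml and the element id "infobox" are made up for illustration. It assumes the default JSON output, where the HTML sits under parse.text["*"], and requests only section 0, like the query in the question.

```javascript
// Build an action=parse URL for the lead section of an article.
// "callback=?" makes jQuery issue a JSONP request, as in the question.
function buildParseUrl(title) {
    return "http://en.wikipedia.org/w/api.php?action=parse&prop=text&section=0" +
        "&format=json&page=" + encodeURIComponent(title) + "&callback=?";
}

// Pull the rendered HTML out of an action=parse response, or return null.
function getParsedHtml(data) {
    if (data && data.parse && data.parse.text && typeof data.parse.text["*"] === "string") {
        return data.parse.text["*"];
    }
    return null; // e.g. the API answered with an error object instead
}

// Only runs in a page where jQuery is loaded; "#infobox" is a hypothetical element.
if (typeof jQuery !== "undefined") {
    jQuery.getJSON(buildParseUrl("Animal"), function (data) {
        var html = getParsedHtml(data);
        if (html !== null) {
            jQuery("#infobox").html(html);
        }
    });
}
```

Passing the title through encodeURIComponent keeps titles with spaces or ampersands from breaking the query string.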

