Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
416 views
in Technique[技术] by (71.8m points)

python - POS tagging in German

I am using NLTK to extract nouns from a text-string starting with the following command:

tagged_text = nltk.pos_tag(nltk.Text(nltk.word_tokenize(some_string)))

It works fine in English. Is there an easy way to make it work for German as well?

(I have no experience with natural language programming, but I managed to use the python nltk library which is great so far.)

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Natural language software does its magic by leveraging corpora and the statistics they provide. You'll need to tell nltk about some German corpus to help it tokenize German correctly. I believe the EUROPARL corpus might help get you going.

See nltk.corpus.europarl_raw and this answer for example configuration.

Also, consider tagging this question with "nlp".


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...