ノート/テキストマイニング/twitter-tweepy1 (Notes/Text Mining/twitter-tweepy1)
https://pepper.is.sci.toho-u.ac.jp:443/pepper/index.php?%A5%CE%A1%BC%A5%C8%2F%A5%C6%A5%AD%A5%B9%A5%C8%A5%DE%A5%A4%A5%CB%A5%F3%A5%B0%2Ftwitter-tweepy1
Last updated 2012-03-02 (Fri) 09:39:16
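References
Installation
To be safe, I installed properly from source. As described in https://github.com/tweepy/tweepy/blob/master/docs/install.rst, download the archive and then run python setup.py install. (When I installed with yum, the version was off and it did not work properly.)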
# wget -4 pypi.python.org/packages/source/t/tweepy/tweepy-1.8.tar.gz
--2012-03-01 16:50:20--  http://pypi.python.org/packages/source/t/tweepy/tweepy-1.8.tar.gz
Resolving pypi.python.org... 82.94.164.168
Connecting to pypi.python.org|82.94.164.168|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 21393 (21K) [application/octet-stream]
Saving to: `tweepy-1.8.tar.gz'

100%[================================================>] 21,393      36.1K/s   in 0.6s

2012-03-01 16:50:21 (36.1 KB/s) - `tweepy-1.8.tar.gz' saved [21393/21393]

# tar -zxvf tweepy-1.8.tar.gz
tweepy-1.8/
tweepy-1.8/PKG-INFO
tweepy-1.8/README
tweepy-1.8/setup.cfg
tweepy-1.8/setup.py
tweepy-1.8/tweepy/
tweepy-1.8/tweepy/__init__.py
tweepy-1.8/tweepy/api.py
tweepy-1.8/tweepy/auth.py
tweepy-1.8/tweepy/binder.py
tweepy-1.8/tweepy/cache.py
tweepy-1.8/tweepy/cursor.py
tweepy-1.8/tweepy/error.py
tweepy-1.8/tweepy/models.py
tweepy-1.8/tweepy/oauth.py
tweepy-1.8/tweepy/parsers.py
tweepy-1.8/tweepy/streaming.py
tweepy-1.8/tweepy/utils.py
tweepy-1.8/tweepy.egg-info/
tweepy-1.8/tweepy.egg-info/dependency_links.txt
tweepy-1.8/tweepy.egg-info/PKG-INFO
tweepy-1.8/tweepy.egg-info/SOURCES.txt
tweepy-1.8/tweepy.egg-info/top_level.txt
tweepy-1.8/tweepy.egg-info/zip-safe

# cd tweepy-1.8
# python setup.py install
running install
install_dir /usr/lib/python2.6/site-packages/
running bdist_egg
running egg_info
writing tweepy.egg-info/PKG-INFO
writing top-level names to tweepy.egg-info/top_level.txt
writing dependency_links to tweepy.egg-info/dependency_links.txt
reading manifest file 'tweepy.egg-info/SOURCES.txt'
writing manifest file 'tweepy.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build
creating build/lib
creating build/lib/tweepy
copying tweepy/api.py -> build/lib/tweepy
copying tweepy/auth.py -> build/lib/tweepy
copying tweepy/parsers.py -> build/lib/tweepy
copying tweepy/error.py -> build/lib/tweepy
copying tweepy/__init__.py -> build/lib/tweepy
copying tweepy/binder.py -> build/lib/tweepy
copying tweepy/oauth.py -> build/lib/tweepy
copying tweepy/cache.py -> build/lib/tweepy
copying tweepy/utils.py -> build/lib/tweepy
copying tweepy/models.py -> build/lib/tweepy
copying tweepy/cursor.py -> build/lib/tweepy
copying tweepy/streaming.py -> build/lib/tweepy
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/tweepy
copying build/lib/tweepy/api.py -> build/bdist.linux-x86_64/egg/tweepy
copying build/lib/tweepy/auth.py -> build/bdist.linux-x86_64/egg/tweepy
copying build/lib/tweepy/parsers.py -> build/bdist.linux-x86_64/egg/tweepy
copying build/lib/tweepy/error.py -> build/bdist.linux-x86_64/egg/tweepy
copying build/lib/tweepy/__init__.py -> build/bdist.linux-x86_64/egg/tweepy
copying build/lib/tweepy/binder.py -> build/bdist.linux-x86_64/egg/tweepy
copying build/lib/tweepy/oauth.py -> build/bdist.linux-x86_64/egg/tweepy
copying build/lib/tweepy/cache.py -> build/bdist.linux-x86_64/egg/tweepy
copying build/lib/tweepy/utils.py -> build/bdist.linux-x86_64/egg/tweepy
copying build/lib/tweepy/models.py -> build/bdist.linux-x86_64/egg/tweepy
copying build/lib/tweepy/cursor.py -> build/bdist.linux-x86_64/egg/tweepy
copying build/lib/tweepy/streaming.py -> build/bdist.linux-x86_64/egg/tweepy
byte-compiling build/bdist.linux-x86_64/egg/tweepy/api.py to api.pyc
byte-compiling build/bdist.linux-x86_64/egg/tweepy/auth.py to auth.pyc
byte-compiling build/bdist.linux-x86_64/egg/tweepy/parsers.py to parsers.pyc
byte-compiling build/bdist.linux-x86_64/egg/tweepy/error.py to error.pyc
byte-compiling build/bdist.linux-x86_64/egg/tweepy/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-x86_64/egg/tweepy/binder.py to binder.pyc
byte-compiling build/bdist.linux-x86_64/egg/tweepy/oauth.py to oauth.pyc
byte-compiling build/bdist.linux-x86_64/egg/tweepy/cache.py to cache.pyc
byte-compiling build/bdist.linux-x86_64/egg/tweepy/utils.py to utils.pyc
byte-compiling build/bdist.linux-x86_64/egg/tweepy/models.py to models.pyc
byte-compiling build/bdist.linux-x86_64/egg/tweepy/cursor.py to cursor.pyc
byte-compiling build/bdist.linux-x86_64/egg/tweepy/streaming.py to streaming.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying tweepy.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying tweepy.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying tweepy.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying tweepy.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying tweepy.egg-info/zip-safe -> build/bdist.linux-x86_64/egg/EGG-INFO
creating dist
creating 'dist/tweepy-1.8-py2.6.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing tweepy-1.8-py2.6.egg
creating /usr/lib/python2.6/site-packages/tweepy-1.8-py2.6.egg
Extracting tweepy-1.8-py2.6.egg to /usr/lib/python2.6/site-packages
Adding tweepy 1.8 to easy-install.pth file

Installed /usr/lib/python2.6/site-packages/tweepy-1.8-py2.6.egg
Processing dependencies for tweepy==1.8
Finished processing dependencies for tweepy==1.8
That completes the installation.
Here is an example of accessing the public timeline, like the one in ノート/テキストマイニング/twitter-1.
>>> import tweepy
>>> public_tweets = tweepy.api.public_timeline()
>>> for tweet in public_tweets:
...     print tweet.text
...
The output is:
??Y en que consiste la prueba? En matarme de hambre. Extranare a mi Hermana se va a Mexico #laquieromucho 欲と目標の違いがわかりませんー(*_*)!もりもりです。はんぐりー精神は捨ててません。 今日もあたしはもりもりです\(^o^)/ RT @Noaax_ RT @_Yessin: Als ik kortaf doe is het vaak een hint van "praat niet tegen me".. http://t.co/P2kLFl7p |Знакомства красная поляна Happy Pills by norahjonesofficial via #soundcloud http://t.co/0ORUoYCV Es muy insoportable, en serio. Hampir nangiis td gara2 denger itu (??????_??????) #whoop RT @BreadwinnerPINO: Aint nothing to Buy a bitch a new PURSE!!!! ヨミマジ孤独 They say you gotta go out to the world to get what you want? But ill much rather things come to me so ill know it's real. ;) Munafuck bgt lo ! Bullshit semua omongan lo ! apa jamban lagi ? huahauhahua I am watching you from here #TB I'm up!! Buh not finally... Buenos dias gentee... Hoy jueves el peor dia de la semana para mi!! Con etica que la odio de verdad :( Pantai Seminyak Masuk 15 Destinasi Populer di Dunia http://t.co/B31F0Hla "@_MsIttyBittyEsh: @itsDaYunginNhea my psychology paper" subject is on??? Настоящий артист - всегда артист, даже на сцене. http://t.co/bTCGr6a4 ???????????????????? ; ???????????? ! . . . ?????? ?????????? ??/ ???????????????? ?? なんかなー
Now let's try the Stream interface, which is the real goal. Using the Stream interface requires authentication (OAuth).
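#!/usr/bin/env python
import tweepy

class StreamListener(tweepy.StreamListener):
    # Alternative: handle the raw data of each message directly.
    # def on_data(self, data):
    #     if data.startswith("{"):
    #         print data

    # Called once for each status (tweet) received; print its text.
    def on_status(self, status):
        print status.text

# Replace these placeholders with the values for your own account.
consumer_key="your account information"
consumer_secret="your account information"
access_token="your account information"
access_token_secret="your account information"

auth1 = tweepy.auth.OAuthHandler(consumer_key, consumer_secret)
auth1.set_access_token(access_token, access_token_secret)

stream = tweepy.Stream(auth1, StreamListener(), timeout=None)
stream.sample()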
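The commented-out on_data handler above hints at the raw form of the stream: each message arrives as text, and the handler only looks at messages that start with "{", i.e. JSON objects. As a minimal sketch of what such a handler might do, independent of tweepy, the hypothetical helper extract_text below parses one raw message and returns the tweet text, or None when the input is not a JSON object or has no "text" field. The helper name and the sample payloads are assumptions made for illustration, not part of tweepy.

```python
import json


def extract_text(data):
    """Return the "text" field of one raw stream message, or None."""
    data = data.strip()
    # Blank lines and anything not starting with "{" are not JSON objects.
    if not data.startswith("{"):
        return None
    try:
        message = json.loads(data)
    except ValueError:
        # Truncated or malformed payload: skip it rather than crash the listener.
        return None
    # A JSON object without a "text" key is not a tweet.
    return message.get("text")


if __name__ == "__main__":
    # Hypothetical payloads: a tweet-like object, a blank line, a non-tweet object.
    samples = [
        '{"text": "hello", "id": 1}',
        "",
        '{"delete": {"status": {"id": 2}}}',
    ]
    for raw in samples:
        print(extract_text(raw))
```

Inside a listener, on_data could call such a helper and print only non-None results, which keeps the parsing logic testable without a network connection.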
The API appears to differ between tweepy versions, so it is best to check the source code at https://github.com/tweepy/tweepy/tree/master/tweepy carefully.
For example, the form of the arguments to Stream seems to have changed quite a bit. In the end, reading the following definition settled it:
class Stream(object):
    def __init__(self, auth, listener, **options):
        self.auth = auth
        self.listener = listener
        self.running = False
        self.timeout = options.get("timeout", 300.0)
        self.retry_count = options.get("retry_count")
        self.retry_time = options.get("retry_time", 10.0)
        self.snooze_time = options.get("snooze_time", 5.0)
        self.buffer_size = options.get("buffer_size", 1500)
The only positional parameters are auth and listener; everything else is passed as a keyword option. The default value of timeout is 300 (presumably in seconds). (At least in this version!)
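To make the keyword-option behaviour concrete, here is a small stand-alone sketch that reads options the same way as the excerpt above. The class name StreamOptions is hypothetical and the class only mimics the option handling; it is not tweepy's Stream.

```python
class StreamOptions(object):
    """Toy class that reads keyword options the same way as the Stream excerpt."""

    def __init__(self, auth, listener, **options):
        self.auth = auth
        self.listener = listener
        # A missing key falls back to the default given as the second argument.
        self.timeout = options.get("timeout", 300.0)
        # With no default given, a missing key yields None.
        self.retry_count = options.get("retry_count")


# No options: the defaults apply.
default = StreamOptions("auth", "listener")
# Explicit options: a key that is present wins, even when its value is None.
explicit = StreamOptions("auth", "listener", timeout=None, retry_count=3)

if __name__ == "__main__":
    print(default.timeout)
    print(explicit.timeout)
```

Because dict.get only uses the default when the key is absent, passing timeout=None, as the script above does, replaces the 300 default with None. A third positional argument raises TypeError, which matches the observation that only auth and listener are positional.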
Sample output of the program above (it calls sample(), so every tweet received is printed with no filtering applied):
@KhairulAsnawi3 ha? Having multiple policies works in certain circumstances, like when there are multiple market failures #MEILecture ?????????? "@_MarkMajor: @Coatnaayy Oh okay. My mistake then lol. I thought it was cuz I suddenly switched to normal speak haha" ahaha nopeeee :) 次世代グラフィックスエンジン「Paradox Engine」を初公開 http://t.co/WCgK5oyJ @k_ahmed1 sir, I've just seen a man walkin down the road and he was like your mirror image! Aduuuh RT @aik_deathripper: @arieotong @ibelbinjamil @vijayaulia aku kalah kuat ma kalian trnyta..haha :D #aintthatthetruth @Hot30Countdown whats the matinee one direction show in sydney?!?!?! :O @xAmiruLIzzaTx Aaaargh!Mentally sick. nti smula2 aq oyk. aq srabuk nih. BRITTIANY KNOCKED OUT Gk tau org nyasar "@irinnaapinkaan: Siapa kaleng rombeng? RT @fidelahusent Tapi ada kaleng rombeng gupek bnr ("¬_¬) "@irinnaapinkaan: Ayoo" neng @safitriidiah klo kacamata gw uda jadii mungkin gw ambil nnti aja deiih,,nnti gw mampir ngsh uang kmrin yakk,, lo mau pergii kan yakk,, @gon_am jajajaj pues manana no creas q vas a vnir aqui para tomart un desyuno vips e...? Jaja http://t.co/Mal0KVcV this is a qr code of facebook ;) @CodeeQR RT @HesGenuine: #WhyDontYou admit that this is how you take your shower (; http://t.co/xuIbSwkP RT @justinbieber: @AhoyBieber we did this together. cant stop smiling. just the beginning i promise. if u are with me i will never leave ... http://t.co/NmQzLd2w dune vsti скачать @iliyore Lol Good night silly Illy...Lol Julio Cesar: "Momento no, ma alzeremo la testa": ??Dal ritiro della sua nazionale, Julio Cesar ha discusso pure d... http://t.co/BZrhBFgT 昨日から4色獰猛エクス回してて思ったけど、すっかり2色デッキの洗練された感じに慣れてしまった感 RT @AlRASHDAN2: ?????????? ?????????? ?????????????????? ???????????????????? ?????? ?????????????????? ???? ?????? ???????????????? A dormir se ha dicho pasen linda noche bsitos ... 
I'm at Tecno Rete (Via Porrettana 278, Casalecchio di Reno) http://t.co/zkgJGi90 ピーピル行きたいけどVoodooに行こうかな…転換時間勉強できるし(笑) 収納ボックスは大体2000円くらいかー Me quiero ir a un colegio onda santa teresa. Donde son mil pibes en secundaria y no se conocen todos con todos. Menos chismes
Filtering the output down to tweets containing kanji, storing them in a DB, and so on will have to wait until tomorrow. ⇒ ノート/テキストマイニング/twitter-tweepy2
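As a rough idea of what the kanji filter mentioned above could look like (a sketch only, not necessarily the approach taken on the follow-up page), the hypothetical function contains_kanji below checks whether a string contains at least one character in the CJK Unified Ideographs block, U+4E00 to U+9FFF. It assumes Python 3 strings (or unicode objects in Python 2). Note that this range also matches Chinese text and does not match tweets written only in hiragana or katakana.

```python
def contains_kanji(text):
    """Return True if text has at least one CJK Unified Ideograph (U+4E00-U+9FFF)."""
    return any(u"\u4e00" <= ch <= u"\u9fff" for ch in text)


if __name__ == "__main__":
    # Tweets taken from the sample output on this page.
    tweets = [
        u"ヨミマジ孤独",
        u"なんかなー",
        u"Es muy insoportable, en serio.",
    ]
    for tweet in tweets:
        if contains_kanji(tweet):
            print(tweet)
```

In the listener, on_status could print status.text only when contains_kanji(status.text) is true.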