Try pulling tweet data from Twitter #5

Open
iannono opened this issue Apr 2, 2024 · 10 comments
iannono commented Apr 2, 2024

Please try pulling a user's tweets through the Twitter API. If needed, you can try the key below:

0xRichardH commented Apr 17, 2024

https://developer.twitter.com/en/docs/authentication/oauth-2-0/bearer-tokens

curl -u "$API_KEY:$API_SECRET_KEY" \
  --data 'grant_type=client_credentials' \
  'https://api.twitter.com/oauth2/token'
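
For use from a script, the same request can be built in Python. This is a minimal sketch that mirrors the curl call above (HTTP Basic auth with the key and secret, form body `grant_type=client_credentials`); the helper name `build_token_request` is made up for illustration, and the function only builds the request without sending it.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(api_key: str, api_secret: str) -> urllib.request.Request:
    """Build (but do not send) the POST that the curl command issues."""
    # curl -u "key:secret" sends the pair as HTTP Basic credentials
    credentials = f"{api_key}:{api_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    # --data 'grant_type=client_credentials' becomes a form-encoded body
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        "https://api.twitter.com/oauth2/token",
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded;charset=UTF-8",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` should return a JSON body whose `access_token` field holds the bearer token, provided the key pair is valid.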


0xRichardH commented Apr 23, 2024

0xRichardH commented Apr 25, 2024

Use a scraper to periodically pull the Twitter feeds of 3 to 5 people: scrape the tweet list.

Using twscrape

import asyncio
import datetime

import aiofiles
from twscrape import API


async def main():
    api = API()  # or API("path-to.db") - default is `accounts.db`

    # Add accounts (arguments: username, password, email, email password)
    await api.pool.add_account(
        "[email protected]", "supersecurepassword", "[email protected]", "mail_pass1"
    )
    await api.pool.login_all()

    # Target list: https://twitter.com/i/lists/1464100857402769409
    list_id = 1464100857402769409

    # One output file per list per day; append so reruns on the same day accumulate
    current_date = datetime.datetime.now().strftime("%Y-%m-%d")
    filename = f"tweets_{list_id}_{current_date}.txt"
    async with aiofiles.open(filename, "a", encoding="utf-8") as file:
        async for tweet in api.list_timeline(list_id, limit=100):
            line = tweet.json()  # already a JSON string, one tweet per line
            await file.write(line + "\n")
            print(line)
            # print(tweet.id, tweet.user.username, tweet.rawContent)
            print("=====================================")


if __name__ == "__main__":
    asyncio.run(main())

tweets_1464100857402769409_2024-05-03.txt
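
Since the script writes one JSON document per line, the output file can be read back with the standard library. A minimal sketch, assuming the file was produced by the script above; the helper name `load_tweets` is made up for illustration.

```python
import json
from pathlib import Path


def load_tweets(path):
    """Read a file with one JSON document per line, skipping blank lines."""
    tweets = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                tweets.append(json.loads(line))
    return tweets
```

For example, `load_tweets("tweets_1464100857402769409_2024-05-03.txt")` would return a list of dicts; the field names (such as `rawContent`, seen in the commented-out print above) depend on the twscrape version.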


Use https://socialdata.gitbook.io/docs/twitter-lists/retrieve-list-details

curl "https://api.socialdata.tools/twitter/lists/show?id=1464100857402769409" \
  -H 'Authorization: Bearer API_KEY' \
  -H 'Accept: application/json'

0xRichardH commented Apr 29, 2024

Tech Spec

Background

    1. I want to keep up with recent tweets, but there are too many to read. (summarize)
    2. I want to find related tweet content by asking questions, e.g. "How has bitcoin been doing recently?" (knowledge base)
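
To make the second goal concrete, here is a deliberately naive sketch of question-to-tweet lookup by word overlap. It is only a stand-in for the planned knowledge base, not the design in this spec; the function name `find_related` and the default `text_key="rawContent"` are assumptions for illustration.

```python
def find_related(tweets, question, text_key="rawContent"):
    """Return tweets sharing at least one word with the question, best matches first."""
    terms = {w.lower().strip("?,.!") for w in question.split()}
    terms.discard("")
    scored = []
    for tweet in tweets:
        text = str(tweet.get(text_key, ""))
        words = {w.lower().strip("?,.!#$") for w in text.split()}
        overlap = len(terms & words)
        if overlap:
            scored.append((overlap, tweet))
    # Sort on the overlap count only, so the tweet dicts are never compared
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tweet for _, tweet in scored]
```

A real knowledge base would more likely rely on embeddings or a search index; plain word overlap misses synonyms and does not handle Chinese text, which has no spaces between words.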

Design
image

@0xRichardH

Scheduled download: https://twitter.com/i/lists/1762857078656532875

  • Run once a day
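
One way to run the job once a day without an external scheduler is to sleep until the next run time. A minimal sketch; the helper name `seconds_until_next_run` and the 03:00 default are arbitrary assumptions, not something decided in this thread.

```python
import datetime


def seconds_until_next_run(now, hour=3, minute=0):
    """Seconds from `now` until the next occurrence of hour:minute."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so aim for tomorrow
        target += datetime.timedelta(days=1)
    return (target - now).total_seconds()
```

A long-running process could loop over `await asyncio.sleep(seconds_until_next_run(datetime.datetime.now()))` followed by the download; a cron entry on the host would achieve the same with less code.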

@fmb-chin

ssh [email protected]

New host

i-0adad346a9973642c, 57.180.63.58, 172.31.23.201
