Tracking missing source metadata #32

Open
2 of 10 tasks
GiorgioBrux opened this issue Nov 21, 2024 · 5 comments

Comments

@GiorgioBrux
Contributor

GiorgioBrux commented Nov 21, 2024

Creating this issue to track feed categories pending source metadata analysis.

The other categories were dealt with in #30.

@azdanov
Contributor

azdanov commented Nov 24, 2024

Could you provide some info on how to find the metadata? I could help with Estonia.

@azdanov
Contributor

azdanov commented Nov 24, 2024

[
  {
    "country": "Estonia",
    "organization": "ERR News",
    "domains": ["err.ee"],
    "description": "ERR News is the Estonian, English and Russian language news service of Estonian Public Broadcasting, providing daily coverage of Estonian and Baltic regional news, politics, culture and current affairs.",
    "owner": "Estonian Public Broadcasting",
    "typology": "State Funded Media"
  },
  {
    "country": "Estonia",
    "organization": "Postimees",
    "domains": ["postimees.ee"],
    "description": "Postimees is Estonia's oldest and largest daily newspaper, founded in 1857. It provides comprehensive coverage of national and international news, politics, business, and culture through Estonian, English and Russian language editions.",
    "owner": "MM Grupp",
    "typology": "Private Media"
  },
  {
    "country": "Estonia",
    "organization": "Delfi",
    "domains": ["delfi.ee"],
    "description": "Delfi is one of Estonia's largest news portals providing news coverage in Estonian, English, Lithuanian, Latvian, Polish and Russian languages. The platform offers real-time news updates, opinion pieces, and multimedia content covering local and international affairs.",
    "owner": "Ekspress Grupp",
    "typology": "Private Media"
  },
  {
    "country": "Estonia",
    "organization": "Äripäev",
    "domains": ["aripaev.ee"],
    "description": "Äripäev is Estonia's leading business newspaper and news portal focused on business, financial and economic news coverage. Founded in 1989, it provides in-depth reporting and analysis of Estonian and international business developments.",
    "owner": "Bonnier Group",
    "typology": "Private Media"
  }
]

@azdanov
Contributor

azdanov commented Nov 24, 2024

[
  {
    "country": "Belgium",
    "organization": "EU Observer",
    "domains": ["euobserver.com"],
    "description": "EUobserver is an independent online newspaper established in 2000 that provides daily news coverage focused on the European Union and European affairs. It specializes in EU policy and politics reporting for a professional audience.",
    "owner": "EUobserver ASBL",
    "typology": "Private Media"
  },
  {
    "country": "United Kingdom",
    "organization": "London School of Economics Blog",
    "domains": ["blogs.lse.ac.uk"],
    "description": "The LSE's European Politics and Policy blog is an academic blog covering European politics, economics, society and public policy. It provides analysis and insights from researchers and experts in European studies.",
    "owner": "London School of Economics and Political Science",
    "typology": "Private Media"
  },
  {
    "country": "Belgium",
    "organization": "Brussels Morning",
    "domains": ["brusselsmorning.com"],
    "description": "Brussels Morning is an online newspaper focused on EU politics and policy, providing daily coverage of European affairs, international relations, and Brussels-based EU institutions.",
    "owner": "",
    "typology": "Private Media"
  },
  {
    "country": "Hungary",
    "organization": "Daily News Hungary",
    "domains": ["dailynewshungary.com"],
    "description": "Daily News Hungary is an English-language news website covering Hungarian politics, business, culture and current affairs for an international audience.",
    "owner": "",
    "typology": "Private Media"
  }
]

@GiorgioBrux
Contributor Author

GiorgioBrux commented Nov 24, 2024

These are the criteria I used for the existing metadata:

  • The country field indicates the primary location or headquarters of the organization/newspaper/source. Can be left empty if it's unclear.
  • The organization field contains either the company or newspaper name, whichever is more widely recognized and relevant.
  • The description field should provide a concise overview of the publication, including its coverage focus and, when there is general consensus, its editorial bias/political leaning. Wikipedia is a super useful "source" for this and that's what I used for the vast majority of sources.
  • The owner field traces the ownership structure to its ultimate controlling entities. Rather than listing intermediate corporate entities (e.g., "Reddit Inc."), it should identify the final shareholders or parent companies that have controlling interest. Sadly this information is often not easily available, so instead of playing detective for hours it's acceptable to list an intermediate corporation or leave the field empty.
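For anyone preparing a PR, here is a minimal sketch of how the criteria above could be checked mechanically before submitting. The field names come from the entries posted in this thread. The function name `validate_entry` and the specific rules (for example, that `domains` must be a non-empty list and that `typology` must not be blank) are assumptions made for illustration, not something the project is known to enforce.

```python
# Field names taken from the entries posted in this thread.
REQUIRED_FIELDS = ("country", "organization", "domains", "description", "owner", "typology")
# Per the criteria above, these two may be left empty when the information is unclear.
MAY_BE_EMPTY = {"country", "owner"}


def validate_entry(entry: dict) -> list:
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in entry:
            problems.append(f"missing field: {field}")
            continue
        value = entry[field]
        if field == "domains":
            # Assumption: at least one domain is needed to match a feed to its source.
            is_valid = (
                isinstance(value, list)
                and len(value) > 0
                and all(isinstance(d, str) and d.strip() for d in value)
            )
            if not is_valid:
                problems.append("domains must be a non-empty list of non-empty strings")
        elif not isinstance(value, str):
            problems.append(f"{field} must be a string")
        elif not value.strip() and field not in MAY_BE_EMPTY:
            problems.append(f"{field} must not be empty")
    return problems


if __name__ == "__main__":
    entry = {
        "country": "Belgium",
        "organization": "Brussels Morning",
        "domains": ["brusselsmorning.com"],
        "description": "Online newspaper focused on EU politics and policy.",
        "owner": "",  # allowed: ownership could not be determined
        "typology": "Private Media",
    }
    print(validate_entry(entry))
```

An empty `owner` or `country` passes on purpose, since the criteria allow leaving them blank rather than guessing.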

Kagi Assistant with search turned on can usually write decent metadata (tested with claude-3-sonnet), but if you go that route you should double-check what it says, especially the owner, the country and the editorial bias.
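If the entries are machine-drafted, a small helper can turn that advice into a per-entry checklist. This is only a sketch: `review_checklist` and `FIELDS_TO_DOUBLE_CHECK` are hypothetical names, and the helper does not verify anything itself, it just lists what a human should look at.

```python
# Fields singled out above for manual verification
# (the editorial bias lives inside the description field).
FIELDS_TO_DOUBLE_CHECK = ("owner", "country", "description")


def review_checklist(entries):
    """Return one reminder line per entry and field that a human should verify."""
    lines = []
    for entry in entries:
        name = entry.get("organization") or "(unnamed source)"
        for field in FIELDS_TO_DOUBLE_CHECK:
            if entry.get(field):
                status = "verify against an independent source"
            else:
                status = "empty, fill in only if a reliable source exists"
            lines.append(f"{name}: {field}: {status}")
    return lines


if __name__ == "__main__":
    drafted = [
        {
            "country": "Hungary",
            "organization": "Daily News Hungary",
            "description": "English-language news website covering Hungary.",
            "owner": "",
        }
    ]
    for line in review_checklist(drafted):
        print(line)
```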

Anyway, your work looks good. You can open the PR yourself so you get credit :) @azdanov

@azdanov
Contributor

azdanov commented Dec 4, 2024

I'd like to add the rest for Estonia, if I can find something. They're less popular than the previous ones. Adding here to keep track:

      "https://online.le.ee/feed/",
      "https://sonumitooja.ee/feed/",
      "https://kesknadal.ee/feed/",
      "https://koiduaeg.ee/feed/",
      "https://eestieest.com/feed/",
      "https://geenius.ee/feed/",
      "https://digi.geenius.ee/feed/",
      "https://uueduudised.ee/feed/",
      "https://eestiuudised.ee/feed/",
      "https://www.ohtuleht.ee/rss",
      "https://www.ituudised.ee/rss",
      "https://www.ehitusuudised.ee/rss",
      "https://www.laanevirumaauudised.ee/rss",
      "https://www.telegram.ee/feed",
      "https://www.tartu.ee/et/rss",
      "https://itfoorum.ee/feed/",
      "https://www.am.ee/rss.xml",
      "https://krupto.ee/feed/",
      "https://www.tervisekassa.ee/rss.xml",
      "https://mil.ee/feed/",
      "https://lounaeestlane.ee/feed/atom/",
      "https://www.internet.ee/eis/uudised.rss",
      "https://objektiiv.ee/feed/atom/",
      "https://feeds.feedburner.com/aripaev-rss",
      "https://feeds2.feedburner.com/delfiuudised",
      "https://feeds2.feedburner.com/delfimaailm",
      "https://feeds2.feedburner.com/forteuudised",
      "https://foorum.hinnavaatlus.ee/smartfeed.php?feed_type=ATOM1.0&limit=LF&sort_by=standard&forum=23&max_word_size=All",

Also, I'm not sure how to handle delfi.ee, since it uses feedburner.com for its 4 RSS feeds at the bottom of the list.
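One possible way to handle the feedburner.com case, sketched under assumptions: derive the publisher domain from the feed URL's host, and fall back to a hand-maintained override table when the host is a feed proxy. The function `publisher_domain` and the `FEEDBURNER_OVERRIDES` table are hypothetical, and the mappings in it are guesses based on the feed names, so they would need confirming. How the project actually matches feeds to `domains` is not stated in this thread.

```python
from urllib.parse import urlparse

# Hosts that serve feeds on behalf of other publishers.
FEED_PROXY_HOSTS = {"feeds.feedburner.com", "feeds2.feedburner.com"}

# Hypothetical, hand-maintained mapping from proxy feed path to publisher domain.
# These pairs are inferred from the feed names only and should be confirmed.
FEEDBURNER_OVERRIDES = {
    "/aripaev-rss": "aripaev.ee",
    "/delfiuudised": "delfi.ee",
    "/delfimaailm": "delfi.ee",
}


def publisher_domain(feed_url):
    """Return the publisher's domain for a feed URL, or None if it cannot be determined."""
    parsed = urlparse(feed_url)
    host = (parsed.hostname or "").lower()
    if host in FEED_PROXY_HOSTS:
        # The host says nothing about the publisher, so only the override table can help.
        return FEEDBURNER_OVERRIDES.get(parsed.path.rstrip("/"))
    if host.startswith("www."):
        host = host[len("www."):]
    return host or None


if __name__ == "__main__":
    for url in (
        "https://www.ohtuleht.ee/rss",
        "https://digi.geenius.ee/feed/",
        "https://feeds2.feedburner.com/delfiuudised",
        "https://feeds2.feedburner.com/forteuudised",  # not in the table: needs a manual decision
    ):
        print(url, "->", publisher_domain(url))
```

Feeds that resolve to `None` are the ones that still need a manual decision. Subdomains such as `digi.geenius.ee` are kept as-is here; whether they should collapse to the parent domain is a separate choice.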
