
Issue arash 20260403 193002 484673


title: "## Audit: biblio OpenAlex API usage vs actual API capabilities"
status: done
created: 2026-04-03
updated: 2026-04-03
timestamp: 20260403-193002-484673
tags: [issue]
source: agent-observation
project_primary: projio
capture_id: 20260403-193000-d83a54
confidence: 1.0
transcript_file: /storage2/arash/worklog/workflow/captures/20260403-193000-d83a54/transcript.txt


Audit: biblio OpenAlex API usage vs actual API capabilities

Now that we have the OpenAlex source code (openalex-elastic-api), docs, and tutorials indexed in RAG, audit biblio's OpenAlex integration for correctness and gaps.

Scope

  1. Endpoint/filter audit: Compare packages/biblio/src/biblio/openalex/openalex_client.py and packages/biblio/src/biblio/discovery.py against the actual API source in .projio/codio/mirrors/ourresearch--openalex-elastic-api/. Check:
     • Are we using the right endpoints and filter syntax?
     • Are we missing useful query parameters (sort, group_by, sample, select)?
     • Is our cursor pagination implementation correct?
     • Rate limiting / polite pool compliance (mailto parameter)

  2. Data model audit: Compare what biblio extracts from OpenAlex responses (_extract_work, _extract_author in author_search.py) vs what's actually available. Check if we should extract:
     • Topics/keywords (OpenAlex has a rich topic hierarchy)
     • Funders, grants, SDGs
     • Author affiliations history (not just last_known_institutions)
     • counts_by_year (citation trends)
     • related_works field

  3. Caching audit: Check packages/biblio/src/biblio/openalex/openalex_cache.py against API best practices from the tutorials.
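
As a baseline for the cursor pagination and polite pool checks in the endpoint/filter audit, here is a minimal sketch of the loop we would expect to see. It assumes the publicly documented behaviour: the first request sends cursor=*, each response carries meta.next_cursor, and paging stops when that value is null. The names iter_works, fetch_page, and http_fetch are hypothetical and are not taken from openalex_client.py; verify the parameter names against the mirrored API source before treating this as correct.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]
FetchPage = Callable[[Dict[str, str]], Page]


def http_fetch(params: Dict[str, str]) -> Page:
    """Fetch one page from the /works endpoint (assumed base URL)."""
    url = "https://api.openalex.org/works?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_works(
    fetch_page: FetchPage,
    filter_expr: str,
    mailto: str,
    select: Optional[str] = None,
    per_page: int = 200,
    max_pages: Optional[int] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield work records page by page using cursor pagination."""
    params = {
        "filter": filter_expr,
        "per-page": str(per_page),
        "cursor": "*",      # "*" requests the first page
        "mailto": mailto,   # identifies us for the polite pool
    }
    if select:
        params["select"] = select  # limit returned fields to shrink payloads

    pages = 0
    while True:
        page = fetch_page(dict(params))
        yield from page.get("results", [])
        pages += 1

        next_cursor = page.get("meta", {}).get("next_cursor")
        # Stop on a null cursor or when the caller's page budget is spent.
        if not next_cursor or (max_pages is not None and pages >= max_pages):
            break
        params["cursor"] = next_cursor
```

Passing the fetcher in as an argument keeps the loop testable without network access. A live call might look like iter_works(http_fetch, "publication_year:2024", "you@example.org", select="id,title", max_pages=2), where the filter and address are placeholders.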

Method

Use rag_query with corpus "codelib" to search the indexed OpenAlex repos. Read the relevant biblio source files. Produce a findings document at docs/specs/biblio/openalex-audit.md with:

  • Current state (what biblio does)
  • API reality (what's available)
  • Gaps (what we should fix/add)
  • Priority ranking

Key files

  • packages/biblio/src/biblio/openalex/openalex_client.py
  • packages/biblio/src/biblio/openalex/openalex_resolve.py
  • packages/biblio/src/biblio/openalex/openalex_cache.py
  • packages/biblio/src/biblio/author_search.py
  • packages/biblio/src/biblio/discovery.py
  • packages/biblio/src/biblio/graph.py
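
For the data model audit, the sketch below shows one way the candidate fields listed under Scope could be read defensively from a single work record. The field names (topics, grants, sustainable_development_goals, counts_by_year, related_works) and their nested keys are assumptions based on the public work object and should be confirmed against the indexed docs. extract_extra_fields is a hypothetical helper, not biblio's existing _extract_work.

```python
from typing import Any, Dict, List


def extract_extra_fields(work: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the audit's candidate fields out of one work record.

    Every lookup tolerates a missing or null field, since records
    fetched with a narrow `select` will not carry all of them.
    """
    topics: List[Dict[str, Any]] = [
        {
            "name": t.get("display_name"),
            "field": (t.get("field") or {}).get("display_name"),
            "domain": (t.get("domain") or {}).get("display_name"),
        }
        for t in (work.get("topics") or [])
    ]

    # Deduplicate funder names across grants; skip grants without a name.
    funders = sorted(
        {
            g["funder_display_name"]
            for g in (work.get("grants") or [])
            if g.get("funder_display_name")
        }
    )

    sdgs = [
        s.get("display_name")
        for s in (work.get("sustainable_development_goals") or [])
    ]

    # Map year -> citation count for trend analysis.
    citations_by_year = {
        c["year"]: c["cited_by_count"]
        for c in (work.get("counts_by_year") or [])
        if "year" in c and "cited_by_count" in c
    }

    return {
        "topics": topics,
        "funders": funders,
        "sdgs": sdgs,
        "citations_by_year": citations_by_year,
        "related_works": list(work.get("related_works") or []),
    }
```

Comparing this output with what _extract_work currently returns for the same cached record would give a concrete per-field gap list for the findings document.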