An open API service providing repository metadata for many open source software ecosystems.

Package Usage: pypi: trafilatura

Python package and command-line tool designed to gather text on the Web. It includes all necessary discovery and text processing components to perform web crawling, downloads, scraping, and extraction of main texts, metadata and comments.
49 versions
Latest release: over a year ago
71 dependent packages
2,070,897 downloads last month

View more package details: https://packages.ecosystem.code.gouv.fr/registries/pypi.org/packages/trafilatura

Dependent Repos 5

medialab/enlinkenment
Enrichment workflow for URL data

Size: 520 KB - Last synced: 5 days ago - Pushed: over 2 years ago

medialab/minet
A webmining CLI tool & library for python.

Size: 15.8 MB - Last synced: 5 days ago - Pushed: 8 days ago

medialab/spsm-database

Size: 2.46 MB - Last synced: 5 days ago - Pushed: over a year ago

gip-inclusion/data-inclusion
data·inclusion aggregates data on social and professional integration

Size: 9.22 MB - Last synced: 5 days ago - Pushed: 5 days ago

medialab/DeFacto 📦
Tools to enrich De Facto's database

Size: 16.9 MB - Last synced: 5 days ago - Pushed: over 2 years ago