@@ Line 1 / Line 1 @@
-== Compress PDF (ghostscript) ==
+== '''🧰 <code>ghostscript</code> ''' ==
+'''<code>ghostscript</code>''' (command <code>gs</code>) is a PostScript and PDF interpreter, used here to compress PDF files.
 <syntaxhighlight lang="bash" copy>
   gs -q -dSAFER -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook -sOUTPUTFILE=fichier_output.pdf -f input.pdf
@@ Line 19 / Line 20 @@
   gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=fichier_output_screen.pdf input.pdf
 </syntaxhighlight>
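The two invocations above differ mainly in the <code>-dPDFSETTINGS</code> preset. As a minimal sketch, a small wrapper can validate the preset name before calling <code>gs</code>. The function name <code>compress_pdf</code> and the <code>GS</code> override variable are assumptions made for this illustration; they are not part of Ghostscript.

```shell
# Hypothetical helper: compress a PDF with a named Ghostscript preset.
# GS can be overridden (e.g. GS=echo) to preview the command without running it.
compress_pdf() {
  [ "$#" -eq 3 ] || { echo "usage: compress_pdf PRESET INPUT OUTPUT" >&2; return 2; }
  preset=$1
  input=$2
  output=$3
  # Accept only the preset names used with -dPDFSETTINGS.
  case "$preset" in
    screen|ebook|printer|prepress|default) ;;
    *) echo "unknown preset: $preset" >&2; return 2 ;;
  esac
  "${GS:-gs}" -q -dSAFER -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
    "-dPDFSETTINGS=/$preset" "-sOutputFile=$output" "$input"
}

# Usage (assumes input.pdf exists and gs is installed):
#   compress_pdf ebook input.pdf output.pdf
```

Running it with <code>GS=echo</code> prints the argument list instead of executing it, which is a cheap way to check a batch script before touching real files.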
-== Convert img to pdf (img2pdf) ==
+== '''🧰 <code>img2pdf</code> ''' ==
 <syntaxhighlight lang="bash" copy>
@@ Line 29 / Line 32 @@
 </syntaxhighlight>
-== Merge PDF (pdftk) ==
+== '''🧰 <code>pdftk</code> ''' ==
+'''<code>pdftk</code>''' is a tool for merging PDF files.
+=== 🚀 '''Basic usage''' ===
+<syntaxhighlight lang="bash" copy>sudo apt-get install pdftk</syntaxhighlight>
+<syntaxhighlight lang="bash" copy>pdftk fichier1.pdf fichier2.pdf cat output fichier3.pdf</syntaxhighlight>
+<syntaxhighlight lang="bash" copy>pdftk mon-document.pdf output mon-document.comprimé.pdf compress</syntaxhighlight>
+source https://debian-facile.org/doc:editeurs:pdftk
+== '''🧰 <code>pdftotext</code> ''' ==
+'''<code>pdftotext</code>''' is a command-line tool from the '''Poppler''' suite
+that extracts the plain text contained in a PDF file.
+It is widely used for analysis, indexing, search, and automated
+processing of PDF documents.
+------------------------------------------------------------------------
+=== 📦 '''Installing pdftotext''' ===
+<syntaxhighlight lang="bash">
+sudo apt install poppler-utils
+</syntaxhighlight>
+====  ℹ️ Note ====
+The '''<code>poppler-utils</code>''' package provides several useful PDF tools, including:
+* <code>pdfinfo</code> → information about the PDF (pages, author, version…)
+* <code>pdftotext</code> → text extraction
+* <code>pdfimages</code> → image extraction
+* <code>pdffonts</code> → list of fonts
+* <code>pdfseparate</code> / <code>pdfunite</code> → split a PDF into single-page files / merge several PDFs
+-----
+=== 🚀 '''Basic usage''' ===
+<ol style="list-style-type: decimal;">
+<li><p>'''Extract the text to a file''' (written to <code>document.txt</code> when no output name is given):</p>
+<syntaxhighlight lang="bash">pdftotext document.pdf</syntaxhighlight></li>
+<li><p>'''Extract the text to standard output''':</p>
+<syntaxhighlight lang="bash">pdftotext document.pdf -</syntaxhighlight></li>
+<li><p>'''Read the text directly with <code>less</code>''':</p>
+<syntaxhighlight lang="bash">pdftotext document.pdf - | less</syntaxhighlight></li></ol>
+-----
+=== 🔧 '''Common options''' ===
+{| class="wikitable"
+|-
+! Option
+! Description
+|-
+| <code>-layout</code>
+| Preserves the original physical layout
+|-
+| <code>-raw</code>
+| Raw extraction (text kept in content-stream order)
+|-
+| <code>-f &lt;n&gt;</code>
+| First page to extract
+|-
+| <code>-l &lt;n&gt;</code>
+| Last page to extract
+|-
+| <code>-nopgbrk</code>
+| Omits the page-break (form feed) characters between pages
+|-
+| <code>-enc UTF-8</code>
+| Sets the output text encoding
+|-
+| <code>-help</code>
+| Full help
+|}
+-----
+=== 💡 '''Practical examples''' ===
+<ul>
+<li><p>'''Extract only pages 2 to 5''':</p>
+<syntaxhighlight lang="bash">pdftotext -f 2 -l 5 document.pdf</syntaxhighlight></li>
+<li><p>'''Preserve the layout''':</p>
+<syntaxhighlight lang="bash">pdftotext -layout document.pdf</syntaxhighlight></li>
+<li><p>'''Quick search in a PDF''':</p>
+<syntaxhighlight lang="bash">pdftotext document.pdf - | grep "mot"</syntaxhighlight></li></ul>
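The quick search above reports matching lines but not where they occur. As a rough sketch, the form feed that <code>pdftotext</code> writes between pages (unless <code>-nopgbrk</code> is given) can be used to recover page numbers. The function name <code>pages_matching</code> is hypothetical, and the match is a plain substring, not a regular expression.

```shell
# Hypothetical helper: print the numbers of the pages whose text contains
# a fixed string. Reads pdftotext output on stdin and treats each
# form-feed-separated record as one page.
pages_matching() {
  awk -v RS='\f' -v needle="$1" 'index($0, needle) { print NR }'
}

# Usage (assumes document.pdf exists):
#   pdftotext document.pdf - | pages_matching "mot"
```

Because the needle is passed through <code>awk -v</code>, backslashes in it are interpreted as escape sequences; plain words are unaffected.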
+-----
+=== 📌 '''Why use pdftotext?''' ===
+* ✅ Fast text extraction
+* ✅ Lightweight and scriptable
+* ✅ Well suited to indexing (it performs no OCR itself, so scanned PDFs need an OCR step first)
+* ✅ Easy to integrate into shell pipelines
+-----
+== '''🧰 <code>pdfinfo</code> ''' ==
+(provided by <code>poppler-utils</code>)
+=== 1️⃣ Basic information about a PDF ===
+<syntaxhighlight lang="bash">pdfinfo document.pdf</syntaxhighlight>
+Sample output:
+<pre>Title:          Rapport annuel
+Author:         ACME Corp
+Creator:        LibreOffice
+Producer:       LibreOffice
+CreationDate:   Mon Jan 15 10:22:00 2024
+ModDate:        Tue Jan 16 09:10:00 2024
+Pages:          42
+Page size:      595 x 842 pts (A4)
+File size:      1234567 bytes
+PDF version:    1.7</pre>
+-----
+=== 2️⃣ Get only the page count ===
+<syntaxhighlight lang="bash">pdfinfo document.pdf | grep '^Pages:'</syntaxhighlight>
+Useful for:
+* scripts
+* validation before processing
+* automation (batch PDF jobs)
+-----
+=== 3️⃣ Find the page size and orientation ===
+<syntaxhighlight lang="bash">pdfinfo document.pdf | grep "Page size"</syntaxhighlight>
+Typical result:
+<pre>Page size: 842 x 595 pts (A4)</pre>
+➡️ width greater than height: landscape
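The comparison of width and height can be automated. The sketch below reads <code>pdfinfo</code> output on standard input and classifies the first <code>Page size</code> line; the function name <code>page_orientation</code> is an assumption for this example, and the page rotation field is deliberately ignored.

```shell
# Hypothetical helper: classify the orientation from a "Page size:" line
# such as "Page size:      842 x 595 pts (A4)" read on stdin.
page_orientation() {
  awk '/^Page size:/ {
    w = $3 + 0          # width in points
    h = $5 + 0          # height in points
    if (w > h)      print "landscape"
    else if (w < h) print "portrait"
    else            print "square"
    exit
  }'
}

# Usage (assumes document.pdf exists):
#   pdfinfo document.pdf | page_orientation
```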
+-----
+=== 4️⃣ Check whether the PDF is protected ===
+<syntaxhighlight lang="bash">pdfinfo document.pdf | grep Encrypted</syntaxhighlight>
+Possible output:
+<pre>Encrypted: yes</pre>
+👉 Indicates that the PDF is encrypted; a password may be required to open it or to lift its restrictions (handle it next with <code>qpdf</code>)
+-----
+=== 5️⃣ Show the full metadata ===
+<syntaxhighlight lang="bash">pdfinfo -meta document.pdf</syntaxhighlight>
+Prints the XMP (XML) metadata, useful for:
+* audits
+* compliance
+* light forensic analysis
+-----
+=== 6️⃣ Per-page box information ===
+<syntaxhighlight lang="bash">pdfinfo -box document.pdf</syntaxhighlight>
+Shows (for the first page by default; add <code>-f</code> and <code>-l</code> to cover a page range):
+* MediaBox
+* CropBox
+* BleedBox
+* TrimBox
+Useful in '''professional printing'''.
+-----
+=== 7️⃣ Shell script: count pages ===
+<syntaxhighlight lang="bash">PAGES=$(pdfinfo document.pdf | awk '/^Pages:/ {print $2}')
+echo "Page count: $PAGES"</syntaxhighlight>
+-----
+=== 8️⃣ Test whether a PDF contains usable text ===
+<syntaxhighlight lang="bash">pdfinfo document.pdf && pdftotext document.pdf -</syntaxhighlight>
+➡️ if <code>pdftotext</code> prints nothing, the PDF is probably scanned (image-only)
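"Prints nothing" is easier to check in a script than by eye, because the output of an image-only PDF may still contain blank lines and form feeds. A minimal sketch, with the hypothetical function name <code>has_extractable_text</code>:

```shell
# Hypothetical helper: succeed if stdin contains at least one
# non-whitespace character (form feeds and blank lines do not count).
has_extractable_text() {
  grep -q '[^[:space:]]'
}

# Usage (assumes document.pdf exists):
#   if pdftotext document.pdf - | has_extractable_text; then
#     echo "text layer found"
#   else
#     echo "no text: probably a scanned PDF"
#   fi
```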
+-----
+=== 9️⃣ Batch over several files ===
+<syntaxhighlight lang="bash">for f in *.pdf; do
+  echo "== $f =="
+  pdfinfo "$f" | grep Pages
+done</syntaxhighlight>
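The same loop can feed a total instead of one line per file. The sketch below sums every <code>Pages:</code> line it receives on standard input; <code>sum_pages</code> is a name chosen for this example.

```shell
# Hypothetical helper: add up the "Pages:" values found in one or more
# concatenated pdfinfo outputs read on stdin.
sum_pages() {
  awk '/^Pages:/ { total += $2 } END { print total + 0 }'
}

# Usage (assumes the current directory contains PDF files):
#   for f in *.pdf; do pdfinfo "$f"; done | sum_pages
```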
+-----
+=== ⚠️ Important notes ===
+* <code>pdfinfo</code> '''does not extract text'''
+* it '''never modifies''' the file
+* it also works on protected PDFs (reading information only); if a password is needed to open the file, pass it with <code>-upw</code>
+-----
+=== 🧠 Quick summary ===
+{| class="wikitable"
+|-
+! Need
+! Command
+|-
+| General information
+| <code>pdfinfo file.pdf</code>
+|-
+| Page count
+| <code>pdfinfo file.pdf &#124; grep Pages</code>
+|-
+| Security
+| <code>pdfinfo file.pdf &#124; grep Encrypted</code>
+|-
+| Metadata
+| <code>pdfinfo -meta file.pdf</code>
+|-
+| Page boxes
+| <code>pdfinfo -box file.pdf</code>
+|}
+-----
+== '''🧰 <code>qpdf</code> ''' ==
+'''<code>QPDF</code>''' is a command-line tool for manipulating,
+inspecting, and transforming PDF files.
+It is commonly used for encrypting, decrypting,
+repairing, and optimizing PDFs.
+------------------------------------------------------------------------
+=== 📦 '''Installing qpdf''' ===
 <syntaxhighlight lang="bash">
- sudo apt-get install pdftk
+sudo apt install qpdf
- pdftk fichier1.pdf fichier2.pdf cat output fichier3.pdf
- pdftk mon-document.pdf output mon-document.comprimé.pdf compress
 </syntaxhighlight>
-source https://debian-facile.org/doc:editeurs:pdftk
+-----
-[[Catégorie:Linux]] [[Catégorie:Tools]] [[Catégorie: Terminal]]
+=== 🚀 '''Basic usage''' ===
+<ol style="list-style-type: decimal;">
+<li><p>'''Decrypt a password-protected PDF''':</p>
+<syntaxhighlight lang="bash">qpdf --password=secret --decrypt input.pdf output.pdf</syntaxhighlight></li>
+<li><p>'''Merge PDFs''':</p>
+<syntaxhighlight lang="bash">qpdf --empty --pages a.pdf b.pdf -- output.pdf</syntaxhighlight></li>
+<li><p>'''Check the structure of a PDF''':</p>
+<syntaxhighlight lang="bash">qpdf --check document.pdf</syntaxhighlight></li></ol>
+-----
+=== 🔧 '''Common options''' ===
+{| class="wikitable"
+|-
+! Option
+! Description
+|-
+| <code>--decrypt</code>
+| Removes the encryption
+|-
+| <code>--encrypt</code>
+| Encrypts a PDF
+|-
+| <code>--check</code>
+| Checks file integrity
+|-
+| <code>--pages</code>
+| Page selection
+|-
+| <code>--linearize</code>
+| Web optimization (linearization)
+|-
+| <code>--show-npages</code>
+| Page count
+|-
+| <code>--help</code>
+| Full help
+|}
+-----
+=== 💡 '''Practical examples''' ===
+<ul>
+<li><p>'''Extract selected pages''':</p>
+<syntaxhighlight lang="bash">qpdf input.pdf --pages input.pdf 1-5 -- output.pdf</syntaxhighlight></li>
+<li><p>'''Optimize a PDF for the web''':</p>
+<syntaxhighlight lang="bash">qpdf --linearize input.pdf output.pdf</syntaxhighlight></li>
+<li><p>'''Show the page count''':</p>
+<syntaxhighlight lang="bash">qpdf --show-npages document.pdf</syntaxhighlight></li></ul>
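The page-count and page-selection examples above can be combined to cut a long PDF into fixed-size parts. The sketch below only computes the ranges; <code>page_chunks</code> and the <code>part_…</code> file names are assumptions made for the illustration.

```shell
# Hypothetical helper: print page ranges ("1-10", "11-20", ...) covering
# pages 1..total in chunks of at most `size` pages, one range per line.
page_chunks() {
  total=$1
  size=$2
  [ "$size" -gt 0 ] || return 2
  start=1
  while [ "$start" -le "$total" ]; do
    end=$((start + size - 1))
    # Clamp the last chunk to the final page.
    if [ "$end" -gt "$total" ]; then end=$total; fi
    echo "$start-$end"
    start=$((end + 1))
  done
}

# Usage (assumes input.pdf exists and qpdf is installed):
#   n=$(qpdf --show-npages input.pdf)
#   for r in $(page_chunks "$n" 10); do
#     qpdf input.pdf --pages input.pdf "$r" -- "part_$r.pdf"
#   done
```

Keeping the range arithmetic in its own function makes it possible to check the chunk boundaries without touching any PDF.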
+-----
+=== 📌 '''Why use QPDF?''' ===
+* ✅ Advanced PDF manipulation
+* ✅ Encryption and decryption
+* ✅ Very reliable for automation
+* ✅ No graphical rendering required
+-----
+== '''🧰 <code>ExifTool</code> ''' ==
+See [[Logiciels_terminal#🧰_ExifTool|ExifTool]]
+[[Catégorie:Linux]] [[Catégorie:Tools]] [[Catégorie: Terminal Tools]]

"Linux tools PDF": difference between revisions
