Aspose.Words FOSS for Python

Convert Word documents to PDF, Markdown, and plain text from Python. Free and open source, with no Microsoft Office required.

Open-Source Python Library for Word Document Conversion

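Aspose.Words FOSS for Python is an MIT-licensed Python library for loading and converting Word documents. It reads DOCX, DOC, RTF, TXT, and Markdown files and exports them to PDF, Markdown, and plain text, without requiring Microsoft Office or any proprietary runtime.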

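The library provides a Document class for loading files and a save() method. The save() method accepts either a SaveFormat constant or a save-options object, such as PdfSaveOptions or MarkdownSaveOptions, for fine-grained control over the output.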

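Install it with pip install "aspose-words-foss>=26.4.0" (the quotes keep the shell from treating >= as an output redirect). The library requires Python 3.10 or later and depends on olefile, fpdf2, and pydantic. It is MIT-licensed and fully open source on GitHub.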

Document Conversion

  • Multi-format input: Load documents from DOCX, DOC, RTF, TXT, and Markdown formats via the Document class.
  • PDF export: Convert any input document to PDF using SaveFormat.PDF or PdfSaveOptions.
  • Markdown export: Export to Markdown with SaveFormat.MARKDOWN or MarkdownSaveOptions.
  • Text extraction: Extract plain text content from documents using Document.get_text().
  • Plain text export: Save documents as plain text using SaveFormat.TEXT.

Where Aspose.Words FOSS Can Be Used

  • Document pipelines: Convert uploaded Word documents to PDF in backend services.
  • Content extraction: Extract text from DOCX or DOC files for indexing and search.
  • Format migration: Batch-convert legacy DOC/RTF archives to modern Markdown or PDF.
  • CI/CD automation: Generate PDF reports from Markdown or DOCX templates in build pipelines.
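The batch-conversion use case above can be sketched roughly as follows. This is a minimal illustration, not part of the library: plan_conversions and convert_to_pdf are hypothetical helper names, and only the Document, save(), and SaveFormat.PDF calls come from this page. The library is imported lazily so that the path planning can be inspected without it installed.

```python
from pathlib import Path

# Input extensions this page lists as readable by the Document class.
INPUT_EXTENSIONS = {".docx", ".doc", ".rtf", ".txt", ".md"}


def plan_conversions(src_dir, out_dir, target_ext=".pdf"):
    """Return sorted (source, destination) path pairs for convertible files.

    Hypothetical helper: mirrors the source tree under out_dir and swaps
    each file's extension for target_ext.
    """
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    pairs = []
    for src in sorted(src_dir.rglob("*")):
        if src.is_file() and src.suffix.lower() in INPUT_EXTENSIONS:
            relative = src.relative_to(src_dir).with_suffix(target_ext)
            pairs.append((src, out_dir / relative))
    return pairs


def convert_to_pdf(src_dir, out_dir):
    """Convert every planned file to PDF (requires aspose-words-foss)."""
    # Imported here so plan_conversions() works without the library installed.
    import aspose.words_foss as aw

    for src, dst in plan_conversions(src_dir, out_dir, ".pdf"):
        dst.parent.mkdir(parents=True, exist_ok=True)
        aw.Document(str(src)).save(str(dst), aw.SaveFormat.PDF)
```

Two files that differ only by extension (for example report.doc and report.docx) map to the same output path in this sketch, so a real pipeline would need its own collision policy.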

Save Options and Customization

  • PdfSaveOptions: Control PDF output settings when converting documents to PDF.
  • MarkdownSaveOptions: Configure Markdown-specific export options.
  • SaveFormat constants: Use SaveFormat.MARKDOWN, SaveFormat.PDF, and SaveFormat.TEXT for quick conversion.
  • Document readers: Dedicated readers for DOC, RTF, TXT, and Markdown input formats.
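One way to use the SaveFormat constants listed above is to pick the constant from the output file's extension. The sketch below assumes only the three constants named on this page; save_format_name and convert are hypothetical helpers, not library functions.

```python
from pathlib import Path

# Output extension -> name of the SaveFormat constant named on this page.
FORMAT_NAMES = {".pdf": "PDF", ".md": "MARKDOWN", ".txt": "TEXT"}


def save_format_name(output_path):
    """Return the SaveFormat constant name implied by the output extension."""
    suffix = Path(output_path).suffix.lower()
    try:
        return FORMAT_NAMES[suffix]
    except KeyError:
        raise ValueError(f"unsupported output extension: {suffix!r}") from None


def convert(input_path, output_path):
    """Load a document and save it in the format implied by output_path."""
    # Lazy import: the lookup above is usable without the library installed.
    import aspose.words_foss as aw

    save_format = getattr(aw.SaveFormat, save_format_name(output_path))
    aw.Document(str(input_path)).save(str(output_path), save_format)
```

Calling convert("input.docx", "output.md") would then be equivalent to passing SaveFormat.MARKDOWN explicitly, while an unsupported extension fails early with a ValueError before any document is loaded.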

Developer Experience

Aspose.Words FOSS installs with pip install aspose-words-foss. Runtime dependencies (olefile, fpdf2, pydantic) are installed automatically.
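To confirm that the package and its runtime dependencies are present in an environment, a check along these lines may help. It relies only on the standard library; the distribution names are the ones given on this page, and the helper names are hypothetical.

```python
import sys
from importlib import metadata

# Distribution names as given on this page.
REQUIRED = ("aspose-words-foss", "olefile", "fpdf2", "pydantic")


def installed_versions(names=REQUIRED):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def python_is_supported(version_info=sys.version_info):
    """Check the Python 3.10+ requirement stated on this page."""
    return tuple(version_info[:2]) >= (3, 10)
```

Printing installed_versions() after installation shows at a glance which of the four distributions pip resolved, with None marking any that are missing.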

The API is straightforward: load a Document from a file path, then call save() with a target path and format. For advanced control, pass a save-options object instead of a format constant. The library is MIT-licensed, open-source on GitHub, and requires Python 3.10 or later.

Convert DOCX to Markdown

Load a Word document and save it as Markdown in two lines of code.

import aspose.words_foss as aw

doc = aw.Document("input.docx")  # or .doc, .rtf, .txt, .md
doc.save("output.md", aw.SaveFormat.MARKDOWN)

Convert DOCX to PDF

Export a Word document to PDF format.

import aspose.words_foss as aw

doc = aw.Document("input.docx")
doc.save("output.pdf", aw.SaveFormat.PDF)

Extract Text from a Document

Read all text content from a Word document.

import aspose.words_foss as aw

doc = aw.Document("input.docx")
text = doc.get_text()
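For indexing and search, the extracted string usually benefits from some cleanup first. This page does not describe the exact characters get_text() returns, so the sketch below simply assumes the text may contain line breaks, form feeds, and other control characters, and collapses them. normalize_for_index is a hypothetical helper, not a library function.

```python
import re


def normalize_for_index(text):
    """Collapse whitespace runs and drop non-whitespace control characters."""
    # Replace control characters that are not ordinary whitespace with a space.
    cleaned = re.sub(r"[\x00-\x08\x0e-\x1f\x7f]", " ", text)
    # Collapse any run of whitespace (spaces, tabs, newlines, form feeds).
    return re.sub(r"\s+", " ", cleaned).strip()
```

The result of doc.get_text() could be passed through normalize_for_index() before handing it to a search index.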

Frequently Asked Questions

What license does Aspose.Words FOSS for Python use?

Aspose.Words FOSS for Python is released under the MIT license. You can use, modify, and distribute it in commercial and personal projects.

How do I install Aspose.Words FOSS for Python?

Install it with pip install "aspose-words-foss>=26.4.0". Quote the requirement so that the shell does not interpret >= as an output redirect. The library requires Python 3.10 or later.

Which document formats are supported?

The library reads DOCX, DOC, RTF, TXT, and Markdown files and exports to PDF, Markdown, and plain text.

How do I convert a DOCX file to PDF?

Import the package with import aspose.words_foss as aw, load the file with doc = aw.Document("input.docx"), then call doc.save("output.pdf", aw.SaveFormat.PDF).

  
