
Gutenberg

Project Gutenberg is an online library of free e-books.

This notebook covers how to load a link to a Gutenberg e-book into a document format that we can use downstream.

from langchain_community.document_loaders import GutenbergLoader
loader = GutenbergLoader("https://www.gutenberg.org/cache/epub/69972/pg69972.txt")
data = loader.load()
data[0].page_content[:300]
'The Project Gutenberg eBook of The changed brides, by Emma Dorothy\r\n\n\nEliza Nevitte Southworth\r\n\n\n\r\n\n\nThis eBook is for the use of anyone anywhere in the United States and\r\n\n\nmost other parts of the world at no cost and with almost no restrictions\r\n\n\nwhatsoever. You may copy it, give it away or re-u'
data[0].metadata
{'source': 'https://www.gutenberg.org/cache/epub/69972/pg69972.txt'}
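The sample output above shows each source line ending in `\r\n` followed by two extra `\n` characters. If that spacing gets in the way downstream, a small post-processing step can collapse it. The following is a minimal sketch, not part of `GutenbergLoader`: the helper name `normalize_line_endings` is hypothetical, and it assumes the rest of the text follows the same line-ending pattern as the sample.

```python
def normalize_line_endings(text: str) -> str:
    """Collapse the '\\r\\n\\n\\n' runs seen in the sample output into single newlines."""
    # Assumption: every original line ends in "\r\n" and is followed by two
    # extra "\n" characters, as in the page_content excerpt shown above.
    text = text.replace("\r\n\n\n", "\n")
    # Convert any remaining Windows-style endings (for example, on the last line).
    return text.replace("\r\n", "\n")


# A shortened excerpt of the sample output, used here so the sketch runs offline.
sample = (
    "The Project Gutenberg eBook of The changed brides, by Emma Dorothy\r\n\n\n"
    "Eliza Nevitte Southworth\r\n\n\n"
    "\r\n\n\n"
    "This eBook is for the use of anyone anywhere in the United States and\r\n"
)

cleaned = normalize_line_endings(sample)
print(cleaned)
```

With a loaded document, the same helper could be applied as `normalize_line_endings(data[0].page_content)`.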
