John Snow Labs
The John Snow Labs NLP & LLM ecosystem includes software libraries for state-of-the-art AI at scale, Responsible AI, and No-Code AI, and provides access to over 20,000 models for healthcare, legal, finance, and other domains.
Models are loaded with nlp.load, and the Spark session is started with nlp.start() under the hood. For all 24,000+ models, see the John Snow Labs Models Hub.
Setting up
%pip install --upgrade --quiet johnsnowlabs
# If you have an enterprise license, you can run this to install enterprise features
# from johnsnowlabs import nlp
# nlp.install()
Example
from langchain_community.embeddings.johnsnowlabs import JohnSnowLabsEmbeddings
Initialize John Snow Labs Embeddings and the Spark Session
embedder = JohnSnowLabsEmbeddings("en.embed_sentence.biobert.clinical_base_cased")
Define some example texts. These can be any documents you want to analyze, such as news articles, social media posts, or product reviews.
texts = ["Cancer is caused by smoking", "Antibiotics aren't painkiller"]
Generate and print embeddings for the texts. The JohnSnowLabsEmbeddings class generates an embedding for each document, which is a numerical representation of the document's content. These embeddings can be used for various natural language processing tasks, such as document similarity comparison or text classification.
embeddings = embedder.embed_documents(texts)
for i, embedding in enumerate(embeddings):
    print(f"Embedding for document {i+1}: {embedding}")
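Document similarity comparison, mentioned above, is commonly done with cosine similarity between two embedding vectors. The following is a minimal sketch of that idea, not part of the JohnSnowLabsEmbeddings API: the helper name cosine_similarity and the toy 3-dimensional vectors are assumptions made for illustration, standing in for the much longer vectors a real model returns.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction, so treat its similarity as 0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (assumption for illustration).
toy_embeddings = [
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]
similarity = cosine_similarity(toy_embeddings[0], toy_embeddings[1])
print(f"Cosine similarity: {similarity:.4f}")
```

With real data, the two arguments would be entries of the list returned by embed_documents, for example embeddings[0] and embeddings[1].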
Generate and print an embedding for a single piece of text, such as a search query. This is useful for tasks like information retrieval, where you want to find documents that are similar to a given query.
query = "Cancer is caused by smoking"
query_embedding = embedder.embed_query(query)
print(f"Embedding for query: {query_embedding}")
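To illustrate the retrieval use case described above, the sketch below ranks documents by how similar their embeddings are to a query embedding. It is a hedged, self-contained illustration rather than a feature of the library: rank_by_similarity is a hypothetical helper, and the 2-dimensional toy vectors are assumptions that stand in for the outputs of embed_query and embed_documents.

```python
import math


def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs sorted by descending cosine similarity.

    Hypothetical helper: vectors are scaled to unit length first, so the
    dot product of two scaled vectors equals their cosine similarity.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        # Leave zero vectors unchanged to avoid dividing by zero.
        return [x / norm for x in v] if norm else list(v)

    q = normalize(query_vec)
    scores = []
    for i, doc in enumerate(doc_vecs):
        d = normalize(doc)
        scores.append((i, sum(a * b for a, b in zip(q, d))))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (assumption for illustration).
toy_query = [1.0, 0.0]
toy_docs = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]

ranking = rank_by_similarity(toy_query, toy_docs)
for idx, score in ranking:
    print(f"Document {idx}: similarity {score:.4f}")
```

With real data, query_embedding would take the place of toy_query and the list returned by embed_documents would take the place of toy_docs. For large collections a vector store is usually a better fit than this linear scan.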
Related
- Embedding model conceptual guide
- Embedding model how-to guides