NLP with Hugging Face Transformers

Use Hugging Face Transformers for text classification, sentiment analysis, named entity recognition, and text generation in Python.

Hugging Face Transformers provides pre-trained models for NLP tasks — eliminating the need to train from scratch for most applications.

Installation

  pip install transformers torch sentencepiece datasets

Sentiment Analysis — Zero Setup

from transformers import pipeline

# With no model argument, pipeline() falls back to a default English sentiment model.
classifier = pipeline("sentiment-analysis")

texts = [
    "I love this product!",
    "Terrible experience, would not recommend.",
    "It was okay, nothing special.",
]
results = classifier(texts)

# Each result is a dict with a "label" and a "score".
for text, result in zip(texts, results):
    print(f"{text!r} -> {result['label']}: {result['score']:.3f}")

Text Classification with Custom Model

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def classify(text):
    """Return (label, confidence) for a single input string."""
    # Tokenize to PyTorch tensors, truncating inputs longer than 512 tokens.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    # Inference only, so skip gradient tracking.
    with torch.no_grad():
        outputs = model(**inputs)
    # Turn raw logits into probabilities over the classes.
    probs = torch.softmax(outputs.logits, dim=-1)
    label = model.config.id2label[probs.argmax().item()]
    confidence = probs.max().item()
    return label, confidence

label, conf = classify("This movie was absolutely fantastic!")
print(f"{label} ({conf:.2f})")
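The softmax and argmax steps inside classify() can be tried without downloading a model. The sketch below uses only the standard library; the logits and the id2label mapping are illustrative values chosen by hand, not real model output.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical values for illustration only.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
logits = [-2.0, 3.0]

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)  # index of the top class
label, confidence = id2label[best], probs[best]
print(f"{label} ({confidence:.2f})")  # POSITIVE (0.99)
```

torch.softmax performs the same computation on tensors, which is why the confidence returned by classify() always lies between 0 and 1.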

Named Entity Recognition (NER)

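from transformers import pipeline

# aggregation_strategy="simple" merges word pieces into whole entities
# (it replaces the deprecated grouped_entities=True argument).
ner = pipeline("ner", aggregation_strategy="simple")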

text = "Apple Inc. was founded by Steve Jobs in Cupertino, California."
entities = ner(text)

for entity in entities:
    print(f"{entity['word']}: {entity['entity_group']} ({entity['score']:.2f})")
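Downstream code often wants the entities bucketed by type rather than printed one by one. The following is a minimal sketch of that step: group_by_type and its min_score threshold are hypothetical helpers, and the sample list is written by hand with made-up scores, showing only the keys used here.

```python
from collections import defaultdict

def group_by_type(entities, min_score=0.5):
    """Bucket entity strings by entity_group, dropping low-confidence hits."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)

# Hand-written sample; the scores are invented for illustration.
sample = [
    {"entity_group": "ORG", "word": "Apple Inc", "score": 0.99},
    {"entity_group": "PER", "word": "Steve Jobs", "score": 0.98},
    {"entity_group": "LOC", "word": "Cupertino", "score": 0.97},
    {"entity_group": "LOC", "word": "California", "score": 0.45},
]

print(group_by_type(sample))
# {'ORG': ['Apple Inc'], 'PER': ['Steve Jobs'], 'LOC': ['Cupertino']}
```

With a real pipeline, group_by_type(ner(text)) would be the equivalent call; the right threshold depends on the model and the domain.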

Text Generation

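from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Python is a programming language that"
# max_new_tokens counts only generated tokens; max_length would also count the prompt.
output = generator(prompt, max_new_tokens=40, num_return_sequences=1)
print(output[0]["generated_text"])  # the prompt followed by the continuation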

Question Answering

from transformers import pipeline

qa = pipeline("question-answering")

context = """
Python was created by Guido van Rossum and first released in 1991.
It emphasizes code readability and supports multiple programming paradigms.
"""

result = qa(question="Who created Python?", context=context)
print(f"Answer: {result['answer']} (confidence: {result['score']:.2f})")
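An extractive QA model always returns some span, even when the context does not contain the answer, so it can help to gate on the score. The sketch below assumes the result dict also carries start and end character offsets into the context; accept_answer and its 0.5 threshold are hypothetical, and the sample result is built by hand with an invented score.

```python
def accept_answer(result, context, min_score=0.5):
    """Return the answer span if the score clears the threshold, else None."""
    if result["score"] < min_score:
        return None
    # start/end are character offsets into the context string.
    return context[result["start"]:result["end"]]

context = "Python was created by Guido van Rossum and first released in 1991."
answer = "Guido van Rossum"
start = context.index(answer)

# Hand-built result for illustration; the score is made up.
sample = {"answer": answer, "score": 0.98, "start": start, "end": start + len(answer)}

print(accept_answer(sample, context))                    # Guido van Rossum
print(accept_answer({**sample, "score": 0.2}, context))  # None
```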

Fine-Tuning on Custom Data

For domain-specific tasks, fine-tune a pre-trained model:

from datasets import load_dataset
from transformers import Trainer, TrainingArguments

dataset = load_dataset("imdb")  # movie reviews with binary sentiment labels

training_args = TrainingArguments(
    output_dir="./results",          # checkpoints are written here
    num_train_epochs=3,
    per_device_train_batch_size=16,
    eval_strategy="epoch",           # evaluate at the end of every epoch
    logging_steps=100,
)

# Define model, tokenizer, data collator, then:
# trainer = Trainer(model=model, args=training_args, train_dataset=..., eval_dataset=...)
# trainer.train()

See Hugging Face docs for full fine-tuning tutorials.
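One way the commented-out steps could be filled in is sketched below. The base checkpoint distilbert-base-uncased, the helper names build_trainer and compute_accuracy, and the choice of accuracy as the metric are all assumptions made for illustration, not part of the original recipe. The heavy imports sit inside build_trainer so the metric function can be tried on its own.

```python
def compute_accuracy(eval_pred):
    """Metric callback: fraction of argmax predictions that match the labels."""
    logits, labels = eval_pred
    preds = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    return {"accuracy": correct / len(labels)}


def build_trainer(model_name="distilbert-base-uncased", output_dir="./results"):
    """Assemble a Trainer for IMDB sentiment (assumed checkpoint and settings)."""
    # Imported lazily so compute_accuracy works without these libraries installed.
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              DataCollatorWithPadding, Trainer, TrainingArguments)

    dataset = load_dataset("imdb")
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

    def tokenize(batch):
        # Padding is left to the data collator so each batch pads to its own max.
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = dataset.map(tokenize, batched=True)
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=3,
        per_device_train_batch_size=16,
        eval_strategy="epoch",
        logging_steps=100,
    )
    return Trainer(
        model=model,
        args=args,
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["test"],
        data_collator=DataCollatorWithPadding(tokenizer),
        compute_metrics=compute_accuracy,
    )
```

Calling build_trainer().train() would start training. On CPU this is slow for the full IMDB dataset, so a GPU or a smaller subset is advisable for a first run.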

Model Hub

Browse 500,000+ models at huggingface.co/models:

Task	Example Model
Sentiment	`distilbert-base-uncased-finetuned-sst-2-english`
Translation	`Helsinki-NLP/opus-mt-en-fr`
Summarization	`facebook/bart-large-cnn`
NER	`dslim/bert-base-NER`
Code generation	`bigcode/starcoder2-7b`

Production Tips

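Cache models locally: the first download is slow, and later loads reuse the local cache
Use a GPU when available: pass device=0 to pipeline()
Batch inputs for throughput: pass a list of texts together with a batch_size
Set max_length with truncation=True to cap input length and memory usage
Consider distilled models (DistilBERT) for faster inference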
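Several of these tips can be combined in one helper. This is a rough sketch rather than a tuned setup: pick_device and classify_all are hypothetical names, the batch size of 32 is an arbitrary starting point, and the model is the sentiment checkpoint listed in the Model Hub table.

```python
def pick_device():
    """Return 0 (first GPU) when CUDA is available, otherwise -1 (CPU)."""
    try:
        import torch
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


def classify_all(texts, batch_size=32):
    """Classify a list of texts in batches on the best available device."""
    # Lazy import; the model is downloaded on first use and cached afterwards.
    from transformers import pipeline

    classifier = pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
        device=pick_device(),
    )
    # truncation and max_length are forwarded to the tokenizer and cap input size.
    return classifier(texts, batch_size=batch_size, truncation=True, max_length=512)
```

In a long-running service the pipeline would normally be created once and reused, since constructing it on every call reloads the model.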

Related Chapters

PyTorch Basics: underlying tensor framework
PyTorch Training: custom training loops
Scikit-learn Pipelines: classical ML alternative

Hugging Face democratized NLP — tasks that required research teams now take five lines of Python.
