Domain-Specific Small Language Models: Efficient AI for local deployment

Author(s): Guglielmo Iozzia (Author)

  • Publisher: Manning Publications
  • Publication Date: May 26, 2026
  • Language: English
  • Print length: 376 pages
  • ISBN-10: 1633436705
  • ISBN-13: 9781633436701

Book Description

Get the eBook free when you register your print book at Manning.

When you need a language model to respond accurately and quickly about a specific field of knowledge, the sprawling capacity of an LLM may hurt more than it helps. This book teaches you to build generative AI models optimized for specific fields.

Perfect for cost- or hardware-constrained environments, Small Language Models (SLMs) train on domain-specific data for high-quality results in specific tasks. In this book you’ll develop SLMs that can generate everything from Python code to protein structures and antibody sequences, all on commodity hardware.

In Domain-Specific Small Language Models you’ll discover:

• Model sizing best practices
• Open source libraries, frameworks, utilities and runtimes
• Fine-tuning techniques for custom datasets
• Hugging Face’s libraries for SLMs
• Running SLMs on commodity hardware
• Model optimization and quantization

Foreword by Matthew R. Versaggi.

About the technology

Small-footprint language models trained on custom data sets and hosted locally can perform as well as large generalist models in speed and accuracy, often at a fraction of the cost. Domain-Specific Small Language Models shows you how to build privacy-preserving and regulation-compliant SLMs for agentic systems, specialist applications, and deployment on the edge.

About the book

This is a practical book that shows you how to adapt pretrained open source models to your domain using transfer learning and parameter-efficient fine-tuning. You’ll learn to minimize cost through optimization and quantization, develop secure APIs to serve your models, and deploy SLMs on commodity hardware—including small devices. The hands-on examples include integrating SLMs into RAG systems and agentic workflows.
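
To give a sense of why parameter-efficient fine-tuning lowers training cost, here is a minimal sketch in plain Python. It assumes LoRA as the technique, which is one common parameter-efficient method; the description above does not say which methods the book covers. LoRA keeps a pretrained weight matrix frozen and trains only two small low-rank factors. The function names, layer size, and rank below are hypothetical values chosen for illustration, not figures from the book.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in one dense weight matrix of shape d_out x d_in."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds for one frozen d_out x d_in matrix.

    LoRA learns B (d_out x rank) and A (rank x d_in); the original
    weight matrix stays frozen and is not counted here.
    """
    return rank * (d_in + d_out)


# Hypothetical projection layer and rank, chosen only for illustration.
d_in, d_out, rank = 2048, 2048, 8

full = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
ratio = lora / full

print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
print(f"fraction trained: {ratio:.4%}")
```

Under these assumed numbers, the low-rank factors hold under one percent of the parameters in the layer they adapt, which is why this style of fine-tuning can fit on modest hardware.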

What’s inside

• ONNX and other quantization methods
• Integrate SLMs into end-to-end applications
• Deploy SLMs on laptops, smartphones, and other devices
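
As a rough illustration of what quantization does, the sketch below implements symmetric per-tensor int8 quantization in plain Python. This is one simple scheme picked for clarity; it is not taken from the book and is not the API of ONNX Runtime or any other library. The function names and sample weights are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from integer codes and the shared scale."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.02, 1.0]

codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("codes:", codes)
print("scale:", scale)
print("max round-trip error:", max_error)
```

Each weight is stored as a single signed byte plus one shared scale, and the round-trip error per weight is bounded by about half the scale. Production toolchains typically use finer-grained schemes (per-channel or per-block scales, calibration data), so treat this only as the basic idea.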

About the reader

For AI engineers familiar with Python.

About the author

Guglielmo Iozzia is a Director of AI and Applied Mathematics at Merck & Co. and a Distinguished Member of the American Society for Artificial Intelligence. He specializes in AI biomedical applications.

The technical editor on this book was Riccardo Mattivi.

Table of Contents

Part 1
1 Small language models
Part 2
2 Tuning for a specific domain
3 End-to-end transformer fine-tuning
4 Running inference
5 Exploring ONNX
6 Quantizing for your production environment
Part 3
7 Generating Python code
8 Generating protein structures
Part 4
9 Advanced quantization techniques
10 Profiling insights
11 Deployment and serving
12 Running on your laptop
13 Creating end-to-end LLM applications
14 Advanced components for LLM applications
15 Test-time compute and small language models

Editorial Reviews

About the Author

Guglielmo Iozzia is a Director, ML/AI and Applied Mathematics at MSD. He studied Electronic and Biomedical Engineering at the University of Bologna and has an extensive background in software and ML/AI engineering applied to real-life use cases across industries such as biotech manufacturing, healthcare, cloud operations, and cybersecurity.

Download

PDF, EPUB | 22 MB | 2026-05-21