Creating an AI Assistant That Produces PDF Files

Most large language models (LLMs) output text strings. Some multimodal models go beyond text and can also generate images, audio, and video. However, they typically do not produce output in specific file formats such as DOCX, PDF, or technical drawings.

In some scenarios, the final output must be in one of these formats. To close this gap, we can rely on the LLM's strong text-generation capability to produce content in text form, then transform that text into the desired format. The transformation can be done via function calling or MCP (Model Context Protocol).

For example: HTML text → PDF, or a Mermaid diagram written in Markdown → a rendered graph.
This article shows the steps to produce PDF files via function calling.


Create an executable function (Ruby)

1) Define what the function can do and its inputs.

{
  "type": "function",
  "function": {
    "name": "html_to_pdf",
    "description": "Convert HTML code to a PDF file and attach it to the conversation.",
    "parameters": {
      "type": "object",
      "properties": {
        "html": {
          "type": "string",
          "description": "HTML code to convert to PDF (maximum 100000 characters)"
        },
        "filename": {
          "type": "string",
          "description": "Optional filename for the PDF (e.g., 'report.pdf'). If not provided, a UUID-based filename will be generated."
        }
      },
      "required": [ "html" ]
    }
  }
}
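
The schema describes the expected inputs, but the model sends its arguments as a JSON string, so it is worth checking them before use. The following is a minimal sketch, not part of the original implementation: the helper name `parse_html_to_pdf_args` is hypothetical, and the 100000-character limit is taken from the description above.

```ruby
require "json"

MAX_HTML_LENGTH = 100_000 # limit stated in the function description

# Hypothetical helper: parses the raw JSON arguments string sent by the model
# and returns [html, filename], raising ArgumentError on invalid input.
def parse_html_to_pdf_args(raw_arguments)
  args = JSON.parse(raw_arguments)
  raise ArgumentError, "arguments must be a JSON object" unless args.is_a?(Hash)

  html = args["html"]
  raise ArgumentError, "html is required" unless html.is_a?(String) && !html.strip.empty?
  raise ArgumentError, "html exceeds #{MAX_HTML_LENGTH} characters" if html.length > MAX_HTML_LENGTH

  filename = args["filename"]
  raise ArgumentError, "filename must be a string" unless filename.nil? || filename.is_a?(String)

  [html, filename]
rescue JSON::ParserError => e
  raise ArgumentError, "arguments are not valid JSON: #{e.message}"
end
```

Raising a single error type keeps the calling code simple: any ArgumentError can be turned into a failure result for the model.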

2) Implement the actual HTML-to-PDF conversion. This example uses WickedPdf.

# In practice, html and filename come from the function call arguments.
html = ""
filename = nil
pdf_generator = WickedPdf.new
pdf_binary = pdf_generator.pdf_from_string(
    html,
    page_size: "A4",
    margin: {
      top: 15,
      bottom: 20,
      left: 15,
      right: 15
    },
    encoding: "UTF-8",
    # Footer with page numbers
    footer: {
      center: "Page [page] of [topage]",
      font_size: 9
    },
    # Security options
    disable_javascript: true,
    no_stop_slow_scripts: true,
    disable_external_links: true,
    enable_local_file_access: false,
    load_error_handling: "ignore"
)

filename = "#{SecureRandom.uuid}.pdf" if filename.blank?
filename = "#{filename}.pdf" unless filename.end_with?(".pdf")
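
Because the filename is supplied by the model, it may contain path separators or unusual characters. The sketch below shows one possible way to normalize it; the helper `safe_pdf_filename` and the allowed character set are assumptions, not part of the original code.

```ruby
require "securerandom"

# Hypothetical helper: returns a safe "*.pdf" filename, falling back to a UUID.
def safe_pdf_filename(requested)
  # Keep only the last path component so "../../etc/passwd" cannot escape.
  base = File.basename(requested.to_s.strip)
  # Replace anything outside a conservative character set.
  base = base.gsub(/[^A-Za-z0-9._-]/, "_")
  # Drop a trailing ".pdf" (any case) so the extension is not duplicated below.
  base = base.sub(/\.pdf\z/i, "")
  # Fall back to a UUID when nothing meaningful is left.
  base = SecureRandom.uuid if base.empty? || base.match?(/\A[._]+\z/)
  "#{base}.pdf"
end
```

With this helper, a request for "My Report.PDF" yields "My_Report.pdf", and a missing filename yields a UUID-based name as in the original lines.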

3) Then attach the file to your conversation or message.

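# The Message model declares the Active Storage attachment.
class Message < ApplicationRecord
  has_many_attached :files
end

# Attach the generated PDF to the message.
message.files.attach(
  io: StringIO.new(pdf_binary),
  filename: filename,
  content_type: "application/pdf"
)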

4) Finally, inform the LLM that the function call has completed.

{
  success: true,
  message: "PDF generated successfully and attached to the conversation"
}

5) This sample code does not sanitize the HTML. In real use cases, you should validate and sanitize the input HTML before conversion.

ActionView's SanitizeHelper is a good starting point.


Use the Function with AI Models

Next, choose a model that outputs text and supports function calling. OpenAI GPT-5 Mini meets these criteria.

1) Enable function calling via tool calls. Add these fields to the request payload:

{
  max_tool_calls: 5,
  tools: [
    {
      type: "function",
      name: func_definition[:function][:name],
      description: func_definition[:function][:description],
      parameters: func_definition[:function][:parameters]
    }
  ]
}
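
To make the payload more concrete, here is a sketch of building a complete request body from the function definition. The helper `to_tool`, the `model`, `input`, and `stream` fields, the model identifier, and the user text are assumptions for illustration; check them against the API version you use.

```ruby
require "json"

# Function definition in the nested shape shown earlier in this article
# (parameter descriptions shortened for brevity).
func_definition = {
  type: "function",
  function: {
    name: "html_to_pdf",
    description: "Convert HTML code to a PDF file and attach it to the conversation.",
    parameters: {
      type: "object",
      properties: { html: { type: "string" }, filename: { type: "string" } },
      required: ["html"]
    }
  }
}

# Hypothetical helper: flattens the nested definition into the tool entry
# used in the payload above.
def to_tool(definition)
  fn = definition.fetch(:function)
  {
    type: "function",
    name: fn.fetch(:name),
    description: fn.fetch(:description),
    parameters: fn.fetch(:parameters)
  }
end

# Assumed request shape; the model identifier and user text are placeholders.
payload = {
  model: "gpt-5-mini",
  input: [{ role: "user", content: "Write a pancake recipe and give it to me as a PDF." }],
  max_tool_calls: 5,
  tools: [to_tool(func_definition)],
  stream: true
}

request_body = JSON.generate(payload)
```

Using `fetch` instead of `[]` makes a malformed definition fail early with a KeyError rather than sending a tool entry with missing fields.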

2) When the model needs to call a function, it returns a series of events like the following:

- response.output_item.added event with type set to function_call
- response.output_item.done event
- response.function_call_arguments.delta event
- response.function_call_arguments.done event

The first two events request execution, and the other two provide the input data. Make sure to store the call_id from the first event.
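
The event handling described above can be sketched as a small collector that works on events already parsed into hashes. The class name and the field names it reads (`item`, `id`, `item_id`, `call_id`, `name`, `delta`, `arguments`) are assumptions about the event payloads and should be verified against the streaming reference.

```ruby
require "json"

# Hypothetical collector: tracks function calls announced by the stream and
# assembles their argument strings from delta events.
class FunctionCallCollector
  PendingCall = Struct.new(:call_id, :name, :arguments)

  attr_reader :completed

  def initialize
    @pending = {}   # item id => PendingCall still receiving arguments
    @completed = [] # calls whose arguments have fully arrived
  end

  def handle(event)
    case event["type"]
    when "response.output_item.added"
      item = event["item"]
      return unless item && item["type"] == "function_call"

      # Store the call_id now; it is needed when returning the result.
      @pending[item["id"]] = PendingCall.new(item["call_id"], item["name"], +"")
    when "response.function_call_arguments.delta"
      call = @pending[event["item_id"]]
      call.arguments << event["delta"].to_s if call
    when "response.function_call_arguments.done"
      call = @pending.delete(event["item_id"])
      return unless call

      # Prefer the complete string from the done event when it is present.
      call.arguments = event["arguments"] if event["arguments"]
      @completed << call
    end
  end
end
```

Feed every parsed event to `handle`; once the stream ends, `completed` holds each call's `call_id`, `name`, and JSON `arguments` string, ready to be executed.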

3) Call the function to produce the PDF file with the provided input data, and return the function's result.

function_call_results << {
  type: "function_call_output",
  call_id: call_id,
  # The output is sent back as a JSON-encoded string.
  output: {
    success: true,
    message: "PDF generated successfully and attached to the conversation"
  }.to_json
}
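
Step 3 can be sketched as a small dispatcher that looks up the requested function, runs it, and reports success or failure back to the model. The registry `TOOL_HANDLERS`, the helper `function_call_output_for`, and the stub handler are hypothetical; a real handler would run the WickedPdf conversion shown earlier. The result is serialized to a JSON string here, on the assumption that the API expects a string in `output`.

```ruby
require "json"

# Hypothetical registry mapping function names to callables. The stub below
# stands in for the real HTML-to-PDF conversion.
TOOL_HANDLERS = {
  "html_to_pdf" => lambda do |args|
    raise ArgumentError, "html is required" if args["html"].to_s.empty?

    { success: true, message: "PDF generated successfully and attached to the conversation" }
  end
}.freeze

# Runs the requested function and builds the function_call_output item.
def function_call_output_for(call_id:, name:, arguments:)
  handler = TOOL_HANDLERS[name]
  result =
    if handler
      begin
        handler.call(JSON.parse(arguments))
      rescue StandardError => e
        # Report the failure to the model instead of aborting the conversation.
        { success: false, error: e.message }
      end
    else
      { success: false, error: "Unknown function: #{name}" }
    end

  { type: "function_call_output", call_id: call_id, output: JSON.generate(result) }
end
```

Returning a failure result rather than raising lets the model explain the problem to the user or retry with corrected input.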

4) Send another request to the AI model with the function_call_results added. At this point, your conversation history should look similar to:

  • user
  • reasoning
  • function_call
  • function_call_output
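
The history listed above can be assembled as the input of the follow-up request. This is a sketch under assumptions: the helper `follow_up_input` is hypothetical, the ids and argument values are placeholders, and `prior_output_items` stands for the reasoning and function_call items returned by the first response, passed back unchanged.

```ruby
require "json"

# Hypothetical helper: builds the input array for the follow-up request in the
# order user, prior model output items, function call results.
def follow_up_input(user_text:, prior_output_items:, function_call_results:)
  [{ role: "user", content: user_text }] + prior_output_items + function_call_results
end

# Placeholder items; ids and argument values are illustrative only.
prior_output_items = [
  { type: "reasoning", id: "rs_example", summary: [] },
  { type: "function_call", id: "fc_example", call_id: "call_example",
    name: "html_to_pdf", arguments: JSON.generate({ html: "<h1>Pancakes</h1>" }) }
]

function_call_results = [
  { type: "function_call_output", call_id: "call_example",
    output: JSON.generate({ success: true,
                            message: "PDF generated successfully and attached to the conversation" }) }
]

input = follow_up_input(
  user_text: "Give me a pancake recipe as a PDF.",
  prior_output_items: prior_output_items,
  function_call_results: function_call_results
)
```

The `call_id` of each function_call_output must match the `call_id` of the function_call it answers, which is why it is stored when the call is first announced.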

5) If the function call output contains a positive result, the AI model usually responds with a final success message (the wording may vary depending on your system prompt). The conversation history then becomes:

  • user
  • reasoning
  • function_call
  • function_call_output
  • assistant

The result

1) We tested the feature by requesting a recipe in PDF format.

2) The PDF was generated correctly.


The same approach can be applied to other types of documents by modifying the function implementation.

