There are many ways to access Generative AI and LLM features. This article demonstrates several of them.
AI Services
AI companies such as Google (Gemini), OpenAI (ChatGPT), Microsoft (Copilot), and Anthropic (Claude) provide services that allow users to interact with Generative AI. These services typically include a free tier with usage caps and paid plans that unlock additional capacity and capabilities. This is currently the most common way people interact with Generative AI.
Run Locally
Tools like Ollama, AnythingLLM, and LM Studio let you run LLMs locally on your own machine. Performance depends on your hardware and the models you select.
1) Download Ollama at https://ollama.com/download
2) Select a suitable model to run. The download will start automatically if you have not used the model before.
ollama run phi4
3) Start using it!
% ollama run phi4
>>> hey
Hello! How can I assist you today? If you have any questions or need information, feel free to let me know.
>>>
4) Access Ollama via its local REST API if needed (see the example after the model list below).
5) View all downloaded models using the command:
% ollama list
NAME               ID              SIZE      MODIFIED
phi4:latest        ac896e5b8b34    9.1 GB    About an hour ago
gemma2:latest      ff02c3702f32    5.4 GB    5 days ago
llama3.1:8b        46e0c10c039e    4.9 GB    7 days ago
llama3.2:3b        a80c4f17acd5    2.0 GB    7 days ago
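For step 4, Ollama serves a local REST API on port 11434 by default. The sketch below is a minimal example using curl, assuming a default installation and the phi4 model pulled earlier; the prompt text is just a placeholder.
# Assumes the default Ollama install listening on localhost:11434
# and the phi4 model pulled in step 2; "stream": false returns a single JSON object
curl http://localhost:11434/api/generate -d '{
  "model": "phi4",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
The JSON response carries the generated text in its response field, so other applications on the same machine can call the model programmatically.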
Embedded Models in Browsers
Some browsers now include embedded AI models. For more details, refer to this previous article:
https://calvin.my/posts/api-updates-on-chrome-on-device-ai-nov-2024
Metered API Access
Proprietary models such as OpenAI's GPT-4o and o1, Anthropic's Claude Haiku, and Google's Gemini Pro can be accessed via APIs provided by their respective service providers. These APIs are typically metered, with usage billed by volume, usually measured in tokens.
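As a rough illustration of what a metered call looks like, the sketch below uses curl against OpenAI's Chat Completions endpoint; it assumes you have created an API key and exported it as the OPENAI_API_KEY environment variable. Other providers offer comparable HTTP APIs with their own endpoints, model names, and authentication schemes.
# Assumes an OpenAI account with an API key exported as OPENAI_API_KEY
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "hey"}]
  }'
Each request is charged to the associated account, and the JSON response includes a usage object reporting how many input and output tokens were consumed.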