Azure OpenAI Service offers API access to powerful OpenAI language models like GPT-4o, enabling tasks such as content generation, summarization, and natural language to code translation.
In this post, I want to show you how easy it is to deploy a language model to an existing Azure OpenAI resource whenever new models become available.
Prerequisite
You need an active Azure subscription and access to the Azure OpenAI Service, which you can request using this link. At the time of writing, it took just a couple of hours for me to gain access, though the form says it can take up to 10 business days.
Creating an Azure OpenAI Resource
The first step is the creation of a resource. After logging into the Azure portal, select the Create a resource option.

Then search for Azure OpenAI, click on it, and then use the Create button.

In the new dialog, you will:
1- Select the subscription where the new resources will be located.
2- Create or select a resource group.
3- Set the region. NOTE: not every model is available in every region. For access to gpt-4o models, you need to select East US 2 at the time of writing this post.
4- Enter a unique name for the resource.
5- Choose the pricing tier.
After this, you can leave other values as default and use the Next button to go forward, then the Create button to complete the resource creation.

The resource creation can take a couple of minutes, after which you will be redirected to a summary page for your deployment. Select the Go to resource button to work with your new Azure OpenAI resource.

Deploy the Model
The Azure portal has recently seen some changes, particularly in managing AI model deployments. Users accustomed to finding model deployment options under the “Model deployments” section within their resource group will now see a prompt indicating that these features have moved to Azure OpenAI Studio.

Model Availability
In the updated Azure OpenAI Studio, users can now access a comprehensive list of AI models, as shown in the image. It’s crucial to note that the availability of these models depends on the region selected during the creation step.

Now we can select the gpt-4o model and use the Deploy option.

When setting up a new deployment, users can configure several key parameters, including the model version, deployment type, and name. A crucial aspect to consider is the rate limit, which dictates the number of tokens processed per minute. In this example, the rate limit is set to 1,000 tokens per minute, translating to 6 requests per minute (RPM). Additionally, the dynamic quota feature, which is enabled here, allows Azure to automatically adjust the rate limits based on demand and resource availability.
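To make the relationship between the two limits concrete, Azure derives the request limit from the token quota at a documented ratio of 6 RPM per 1,000 TPM. A small sketch of that arithmetic (the function name is mine, for illustration only):

```python
def tpm_to_rpm(tokens_per_minute):
    """Estimate the requests-per-minute limit Azure derives from a
    tokens-per-minute quota, using the 6 RPM per 1,000 TPM ratio."""
    return tokens_per_minute * 6 // 1000

# The 1,000 TPM deployment in this post allows about 6 requests per minute.
print(tpm_to_rpm(1000))
```

So raising the quota to, say, 10,000 TPM would correspond to roughly 60 RPM under the same ratio.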

Test the Model in Playground
Once our new deployment is ready, we can use it in the playground to test it.

The Playground in Azure OpenAI Studio provides a dynamic environment for testing and fine-tuning AI models. The image shows a practical example of using the deployment “rj-gpt-4o” to generate responses based on user prompts.
The setup panel on the left allows users to define system messages, use templates, and add examples to guide the AI’s responses. The configuration panel on the right lets users select their deployment and adjust session settings, such as the number of past messages included and the current token count. Here, we prompted the model to print the first 100 prime numbers, showcasing the model’s ability to handle mathematical queries effectively. This interactive playground is invaluable for developers to experiment with different configurations and optimize their AI models for various applications.
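Behind the scenes, the system message and past turns configured in the Playground map directly onto the `messages` array that the Chat Completions API expects. A minimal sketch of that structure (the helper name and system text are my own placeholders, not part of the Azure API):

```python
def build_messages(system_message, user_prompt, history=None):
    """Assemble a Chat Completions `messages` array: the system message
    first, then any past turns, then the new user prompt."""
    return (
        [{"role": "system", "content": system_message}]
        + (history or [])
        + [{"role": "user", "content": user_prompt}]
    )

messages = build_messages(
    "You are an AI assistant that helps people find information.",
    "Print the first 100 prime numbers.",
)
```

The “past messages included” setting in the configuration panel simply controls how many prior turns end up in that `history` portion of the array.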

Keys and API Access
In the Azure portal, securing and managing access to your AI services is crucial, and the “Keys and Endpoint” section facilitates this process effectively.
As shown in the image, within the “rj-gpt-demo” resource group, users can navigate to the “Keys and Endpoint” option highlighted by red arrows in the left sidebar. This section displays essential information, including two access keys and an endpoint URL, which are vital for making API calls to Azure AI services. Users are advised to store these keys securely and regenerate them regularly to maintain security, with options to regenerate each key separately, ensuring uninterrupted access during the update process.
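Rather than hardcoding a key in source code, a common pattern is to read the endpoint and key from environment variables. A minimal sketch (the variable names are my own convention, not an Azure requirement):

```python
import os

def load_azure_openai_config():
    """Read the endpoint URL and API key from environment variables,
    failing fast if either is missing."""
    endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
    api_key = os.environ.get("AZURE_OPENAI_API_KEY")
    if not endpoint or not api_key:
        raise RuntimeError(
            "Set AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY first"
        )
    return endpoint, api_key
```

Keeping the key out of the code also makes the periodic key regeneration mentioned above painless: rotate the key in the portal, update the environment variable, and nothing in the application changes.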

Testing with cURL
Using the endpoint and key provided, we can test our model with cURL, asking the same question we used in the playground (replace YOUR_API_KEY with either of the two keys):
curl "https://rj-gpt-demo.openai.azure.com/openai/deployments/rj-gpt-4o/chat/completions?api-version=2024-02-15-preview" \
  -H "Content-Type: application/json" \
  -H "api-key: YOUR_API_KEY" \
  -d '{"messages": [{"role": "user", "content": "Find the first 100 prime numbers."}], "max_tokens": 200}'
This will return a JSON response with the numbers and additional information regarding the call itself.
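The same call can be made from Python with only the standard library. This sketch assumes the standard Chat Completions response shape, where the model’s reply lives in `choices[0].message.content` (the endpoint and deployment names below are the ones created in this post; substitute your own):

```python
import json
import urllib.request

ENDPOINT = "https://rj-gpt-demo.openai.azure.com"  # from Keys and Endpoint
DEPLOYMENT = "rj-gpt-4o"
API_VERSION = "2024-02-15-preview"

def extract_reply(response_body):
    """Pull the assistant's text out of a Chat Completions response dict."""
    return response_body["choices"][0]["message"]["content"]

def ask(prompt, api_key):
    """POST a single-turn chat request to the deployment and return the reply."""
    url = (f"{ENDPOINT}/openai/deployments/{DEPLOYMENT}"
           f"/chat/completions?api-version={API_VERSION}")
    payload = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 200,
    }).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload,
        headers={"Content-Type": "application/json", "api-key": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return extract_reply(json.load(response))
```

Calling `ask("Find the first 100 prime numbers.", key)` reproduces the cURL request above; the rest of the JSON response carries metadata such as token usage for the call.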
Conclusion
Deploying new language models using Azure OpenAI Service is a streamlined process that begins with securing access and creating the necessary resources in the Azure portal. Once the resources are set up, navigating the updated Azure AI Studio allows for selecting and configuring models like GPT-4o, ensuring optimal performance and scalability.
The interactive playground provides a valuable environment for testing and fine-tuning model responses, while securing access through the “Keys and Endpoint” section is critical for maintaining secure API interactions. By following these steps, developers can leverage Azure’s robust infrastructure to integrate powerful AI capabilities into their applications efficiently and securely.


