In today’s blog post, we take a look at restricting access to the Azure IMDS endpoint on an Azure Kubernetes Service (AKS) cluster with Cilium using the BYOCNI approach.
The Instance Metadata Service (IMDS) endpoint can be called directly from every Azure VM or VMSS instance via the following command.
curl -s -H Metadata:true --noproxy "*" "http://169.254.169.254/metadata/instance?api-version=2021-02-01"
When we query the IMDS from an Azure VM or VMSS instance, we retrieve information that is valuable to an adversary, as seen in the output below.
❯ kubectl run -it --image=ubuntu idms -- /bin/bash
root@idms:/# apt update && apt install curl jq -y
root@idms:/# curl -s -H Metadata:true --noproxy "*" "http://169.254.169.254/metadata/instance?api-version=2021-02-01" | jq .
{
  "compute": {
    "azEnvironment": "AzurePublicCloud",
    "customData": "",
    ...
    "location": "northeurope",
    "name": "aks-default-13458874-vmss_68",
    "offer": "",
    "osProfile": {
      "adminUsername": "azureuser",
      "computerName": "aks-default-13458874-vmss00001W",
      "disablePasswordAuthentication": "true"
    },
    "osType": "Linux",
    ...
    "resourceId": "/subscriptions/<REDACTED>/resourceGroups/rg-aks-azst-1-nodes/providers/Microsoft.Compute/virtualMachineScaleSets/aks-default-13458874-vmss/virtualMachines/68",
    "securityProfile": {
      "secureBootEnabled": "false",
      "virtualTpmEnabled": "false"
    },
    ...
    "storageProfile": {
      "dataDisks": [],
      "imageReference": {
        "id": "/subscriptions/<REDACTED>/resourceGroups/AKS-AzureLinux/providers/Microsoft.Compute/galleries/AKSAzureLinux/images/V3gen2/versions/202505.14.0",
        ...
      },
      "osDisk": {
        ...
        "diskSizeGB": "128",
        ...
      },
      ...
    },
    "subscriptionId": "<REDACTED>",
    ...
    "tagsList": [
      ...
      {
        "name": "aks-managed-enable-imds-restriction",
        "value": "false"
      },
      {
        "name": "aks-managed-kubeletIdentityClientID",
        "value": "<REDACTED>"
      },
      ...
      {
        "name": "aks-managed-orchestrator",
        "value": "Kubernetes:1.33.0"
      },
      ...
      {
        "name": "aks-managed-ssh-access",
        "value": "LocalUser"
      }
    ],
    "userData": "",
    ...
    "vmId": "8b1c6ff8-c408-469e-b35b-6c2bb426e430",
    "vmScaleSetName": "aks-default-13458874-vmss",
    "vmSize": "Standard_D4ads_v5",
    "zone": "2"
  },
  "network": {
    "interface": [
      {
        "ipv4": {
          "ipAddress": [
            {
              "privateIpAddress": "10.10.0.6",
              "publicIpAddress": ""
            }
          ],
          "subnet": [
            {
              "address": "10.10.0.0",
              "prefix": "20"
            }
          ]
        },
        "ipv6": {
          "ipAddress": []
        },
        "macAddress": "7C1E52781468"
      }
    ]
  }
}
Additionally, the IMDS endpoint is used for retrieving authentication tokens when using an Azure managed identity to access other Azure services.
-> https://learn.microsoft.com/en-us/azure/virtual-machines/instance-metadata-service?WT.mc_id=AZ-MVP-5000119&tabs=linux
-> https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-to-use-vm-token?WT.mc_id=AZ-MVP-5000119#get-a-token-using-http
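For illustration, such a token request looks like the one below. The target resource is just an example; Fluent Bit, as we will see later in the flow logs, requests a token for https://api.kusto.windows.net instead.
# Request an access token via the VM's managed identity.
# The resource parameter (Azure Resource Manager) is an example target.
curl -s -H Metadata:true --noproxy "*" \
  "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2021-02-01&resource=https://management.azure.com/"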
Hence, as part of a multi-layer security approach, it is important to restrict access to the IMDS endpoint to only those pods that require it.
In our example, this will be Fluent Bit, which uses an Azure managed identity to ingest log data into an Azure Data Explorer cluster.
As a first step, we provide Cilium with additional metadata about the IMDS endpoint by using a Cilium CIDR group.
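A minimal sketch of such a CIDR group, using the azure-imds name that the network policies below reference, could look like this (the apiVersion may differ depending on your Cilium version):
apiVersion: cilium.io/v2alpha1
kind: CiliumCIDRGroup
metadata:
  name: azure-imds
spec:
  externalCIDRs:
    # The link-local address of the Azure IMDS endpoint.
    - 169.254.169.254/32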
Then we observe the network traffic, either via the Hubble CLI or the Hubble UI, before working on our Cilium cluster-wide network policies.
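In case Hubble is not yet exposed on your machine, the Cilium CLI offers commands for both options; the following is a sketch assuming Hubble and the Hubble UI are enabled on the cluster.
# Forward the Hubble Relay port for the Hubble CLI.
cilium hubble port-forward &
# Open the Hubble UI in the browser.
cilium hubble ui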
Observe network traffic
In my test cluster, I have a couple of namespaces.
❯ kubectl get namespaces
NAME              STATUS   AGE
cert-manager      Active   116d
cilium-secrets    Active   116d
default           Active   116d
go-webapp         Active   116d
grafana-alloy     Active   62d
istio-config      Active   116d
istio-system      Active   116d
kube-node-lease   Active   116d
kube-public       Active   116d
kube-system       Active   116d
logging           Active   98d
For most of them, I already know whether or not they depend on the IMDS endpoint. As a due-diligence check, I chose my application namespace, go-webapp, where I did not expect any surprises.
I used the following Hubble CLI command, hubble observe -P --to-ip 169.254.169.254 --not --from-identity host -f, to observe the entire cluster traffic with a filter to exclude the host identity, as I am not interested in node-level traffic. Then I did a rollout restart of my application.
To my surprise, there was traffic to the IMDS endpoint, as seen in the output below.
❯ hubble observe -P --to-ip 169.254.169.254 --not --from-identity host -f
Jul 15 21:49:39.668: go-webapp/go-webapp-847857f68-2qlnd:40132 (ID:14226) -> 169.254.169.254:80 (ID:16777218) policy-verdict:all EGRESS ALLOWED (TCP Flags: SYN)
Jul 15 21:49:39.668: go-webapp/go-webapp-847857f68-2qlnd:40132 (ID:14226) -> 169.254.169.254:80 (ID:16777218) to-network FORWARDED (TCP Flags: SYN)
Jul 15 21:49:39.668: go-webapp/go-webapp-847857f68-2qlnd:40132 (ID:14226) -> 169.254.169.254:80 (ID:16777218) to-network FORWARDED (TCP Flags: ACK)
Jul 15 21:49:39.668: go-webapp/go-webapp-847857f68-2qlnd:40132 (ID:14226) -> 169.254.169.254:80 (ID:16777218) to-network FORWARDED (TCP Flags: ACK, PSH)
...
My application itself does not contact the IMDS endpoint, so the only component left is the Istio sidecar proxy.
A quick look at the Istio source code reveals that the sidecar proxy uses the IMDS endpoint to determine the cloud platform it is running on.
-> https://github.com/istio/istio/blob/master/pkg/bootstrap/platform/azure.go
What do we do? Do we still allow every pod to access the IMDS endpoint, or do we restrict it?
When we restrict it, the flow logs will be flooded with policy-deny and dropped-traffic entries.
❯ hubble observe -P --to-ip 169.254.169.254 --not --from-identity host -f
Jul 15 21:54:30.797: go-webapp/go-webapp-8568cffcd4-l5s48:37378 (ID:14226) <> 169.254.169.254:80 (ID:16777218) policy-verdict:L3-Only EGRESS DENIED (TCP Flags: SYN)
Jul 15 21:54:30.797: go-webapp/go-webapp-8568cffcd4-l5s48:37378 (ID:14226) <> 169.254.169.254:80 (ID:16777218) Policy denied by denylist DROPPED (TCP Flags: SYN)
Jul 15 21:54:31.138: go-webapp/go-webapp-8568cffcd4-tqsfh:51964 (ID:14226) <> 169.254.169.254:80 (ID:16777218) policy-verdict:L3-Only EGRESS DENIED (TCP Flags: SYN)
Jul 15 21:54:31.138: go-webapp/go-webapp-8568cffcd4-tqsfh:51964 (ID:14226) <> 169.254.169.254:80 (ID:16777218) Policy denied by denylist DROPPED (TCP Flags: SYN)
...
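As a side note, if you only want to watch for dropped IMDS traffic, the Hubble CLI's verdict filter narrows the output down; a possible variant of the command used above:
hubble observe -P --to-ip 169.254.169.254 --verdict DROPPED -f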
Apply fix to Istio configuration
Hence, we need to fix the root cause, and that is Istio. Fortunately, Istio provides a configuration option, CLOUD_PLATFORM, to tell the sidecar proxy which cloud platform it is running on. If you now think we simply set it to azure and everything is back to normal, I have to disappoint you. That disables only one part of the sidecar proxy that causes the IMDS traffic. We must set it to none.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: istiocontrolplane
spec:
  ...
  meshConfig:
    defaultConfig:
      proxyMetadata:
        CLOUD_PLATFORM: "none"
  ...
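Assuming the Istio control plane is managed via istioctl and the definition above is stored in istiocontrolplane.yaml (both are just examples), applying the change could look like the following. Keep in mind that proxyMetadata is injected as environment variables into the sidecar, so existing workloads need a restart to pick up the change.
# Apply the updated IstioOperator definition (file name is an example).
istioctl install -f istiocontrolplane.yaml
# Restart workloads so their sidecars pick up the new proxyMetadata,
# e.g., the go-webapp deployment.
kubectl rollout restart deployment go-webapp --namespace go-webapp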
After applying the new Istio configuration, we check the network traffic again. No more IMDS traffic from the Istio sidecar proxy.
Restrict network traffic
Now we can focus on the Cilium cluster-wide network policies. Yes, we need two of them to achieve our goal. The reason is that we disable the default egress deny behavior, which is a common best practice when applying cluster-wide network policies.
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: azure-imds-deny
  annotations:
    description: "Deny traffic to Azure IMDS"
  labels:
    app.kubernetes.io/part-of: cilium
    area: network-security
spec:
  endpointSelector:
    matchExpressions:
      - key: k8s:io.kubernetes.pod.namespace
        operator: NotIn
        values:
          - kube-system
          - logging
          - grafana-alloy
  enableDefaultDeny:
    egress: false
    ingress: false
  egressDeny:
    - toCIDRSet:
        - cidrGroupRef: azure-imds
As seen above, we use an egress deny rule to block traffic to the IMDS endpoint from all namespaces except kube-system, logging (Fluent Bit), and grafana-alloy.
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: azure-imds-allow
  annotations:
    description: "Allow traffic to Azure IMDS"
  labels:
    app.kubernetes.io/part-of: cilium
    area: network-security
spec:
  endpointSelector:
    matchExpressions:
      - key: k8s:io.kubernetes.pod.namespace
        operator: In
        values:
          - kube-system
          - logging
          - grafana-alloy
  enableDefaultDeny:
    egress: false
    ingress: false
  egress:
    - toCIDRSet:
        - cidrGroupRef: azure-imds
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/metadata"
In the second network policy, we use an egress rule to restrict the traffic to HTTP GET requests on the /metadata path for the namespaces kube-system, logging (Fluent Bit), and grafana-alloy. With that, we gain insight into the IMDS traffic of the allowed namespaces while restricting it at the same time.
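Both policy definitions can then be applied and checked with kubectl; the file names are examples, and ccnp is the short name Cilium registers for CiliumClusterwideNetworkPolicy.
kubectl apply -f azure-imds-deny.yaml
kubectl apply -f azure-imds-allow.yaml
# Verify that both cluster-wide network policies are in place.
kubectl get ccnp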
After applying both policies, I did a rollout restart of the Fluent Bit daemon set to observe the traffic.
❯ hubble observe -P --to-ip 169.254.169.254 --not --from-identity host -f
Jul 15 21:58:44.291: logging/fluent-bit-qg7vq:52482 (ID:21152) -> 169.254.169.254:80 (ID:16777218) policy-verdict:L3-L4 EGRESS ALLOWED (TCP Flags: SYN)
Jul 15 21:58:44.291: logging/fluent-bit-qg7vq:52482 (ID:21152) -> 169.254.169.254:80 (ID:16777218) to-proxy FORWARDED (TCP Flags: SYN)
Jul 15 21:58:44.291: logging/fluent-bit-qg7vq:52482 (ID:21152) -> 169.254.169.254:80 (ID:16777218) to-proxy FORWARDED (TCP Flags: ACK)
Jul 15 21:58:44.291: logging/fluent-bit-qg7vq:52482 (ID:21152) -> 169.254.169.254:80 (ID:16777218) to-proxy FORWARDED (TCP Flags: ACK, PSH)
Jul 15 21:58:44.292: logging/fluent-bit-qg7vq:52482 (ID:21152) -> 169.254.169.254:80 (ID:16777218) http-request FORWARDED (HTTP/1.1 GET http://169.254.169.254:80/metadata/identity/oauth2/token?api-version=2021-02-01&resource=https://api.kusto.windows.net)
Jul 15 21:58:44.300: logging/fluent-bit-qg7vq:52482 (ID:21152) -> 169.254.169.254:80 (ID:16777218) to-proxy FORWARDED (TCP Flags: ACK, FIN)
Jul 15 21:58:44.300: logging/fluent-bit-qg7vq:52482 (ID:21152) -> 169.254.169.254:80 (ID:16777218) to-proxy FORWARDED (TCP Flags: ACK)
...
Fluent Bit can reach the IMDS endpoint for authentication against Azure Data Explorer using an Azure managed identity and ingest the log data.
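To double-check the deny path, we can reuse the test pod approach from the beginning of this post in a namespace that is not on the allow list (the pod name imds-test and the five-second timeout are arbitrary). Instead of returning metadata, the request should now simply time out.
❯ kubectl run -it --image=ubuntu imds-test -- /bin/bash
root@imds-test:/# apt update && apt install curl -y
root@imds-test:/# curl -s -m 5 -H Metadata:true --noproxy "*" "http://169.254.169.254/metadata/instance?api-version=2021-02-01"
No output is returned, as the egress deny rule drops the SYN packets and curl gives up after five seconds.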
Summary
Restricting traffic to the Azure IMDS endpoint is quite straightforward. Depending on how you use cluster-wide network policies, you either need two network policy definitions or only one. The latter requires you to use the default egress deny mode but spares you a second policy. Both approaches have their pros and cons. Besides that, keep in mind that unexpected things can happen, like the IMDS traffic from the Istio sidecar proxy.
As always, you can find the example configurations in my GitHub repository.
-> https://github.com/neumanndaniel/kubernetes/tree/master/cilium/azure-imds
-> https://github.com/neumanndaniel/kubernetes/tree/master/cilium/metadata-information
