Azure
Cloud
Last synthesized: 2026-02-13 02:47 | Model: gpt-5-mini
Table of Contents
1. Azure Portal / VM Visibility and Role Access Problems
2. Provisioning Subscriptions, Resource Groups, VMs and GPU Reservations for Projects
3. Secret Management and Expired Client Secrets Causing Service Failures
4. Azure OpenAI / API Key Provisioning, Endpoint Access and Cost Attribution Constraints
5. Log Analytics / Sentinel Ingestion Cost Analysis and Commitment Tier Selection
6. VPN / Certificate Gating and SSH Key Authorization Blocking Access to Azure VMs (GPU Instances)
7. Missing Cost Management Data After Contract Renewal Due to Point of Cost Analysis Reset
8. Migrating Application Backups to an Azure Storage Account and Stopping Old Backup Targets
9. Provisioning and Onboarding of a New Microsoft Tenant for a Course Integration
10. Transient Microsoft / Azure Service and Network Infrastructure Outages Causing Access Failures
11. UniFLOW Online cloud-printing integration and inconsistent card/PIN authentication
12. SFTP access to Azure Blob blocked by missing credentials and Intune app-install restrictions
13. Automated Course Feed membership removals linked to Teams app update and Logic Apps workflow
14. Unexpected Microsoft notice about unused subscriptions and subscription ID legitimacy concerns
1. Azure Portal / VM Visibility and Role Access Problems
Solution
Support restored Portal visibility and management by resolving identity, role-assignment, group membership, licensing, resource-scope, and service-principal provisioning issues across Azure/Entra and management-group/subscription boundaries. Actions that resolved cases included: confirming the correct sign-in account and tenant; creating or provisioning missing Azure AD groups and MyAccess packages; adding users to existing VM-access or security groups; and assigning requested roles (Owner/Contributor/Reader) to user accounts or security groups at the appropriate subscription, management-group, or resource scope. For Microsoft.Capacity reservation failures, access was restored by assigning the Reservation Purchaser role (or equivalent purchaser/owner rights) at the correct scope (reservationOrder/parent resource or subscription) when an Owner role on the individual reservation resource was insufficient; this addressed errors such as "User is not an owner or reservation purchaser on subscription '
2. Provisioning Subscriptions, Resource Groups, VMs and GPU Reservations for Projects
Solution
New subscriptions and resource groups were created on behalf of teams when requesters lacked tenant- or subscription-level permissions, and were moved under the correct management group when required. For billing‑sensitive subscriptions (for example, dedicated OpenAI o1 discounts), Microsoft contacts and contract verification were used before creation. Virtual machines were provisioned (frequently via Terraform by the service provider) to match requested sizes, names, and exact-clone requirements, and identical environments were replicated when requested. User accounts and Office 365 licenses were created, and VM credentials were handed over via 1Password for secure access.
GPU work included replacing an expiring A100 reservation by provisioning Standard_NC80adis_H100_v5 instances, initiating multi‑year reservation procurement, shutting down expiring instances to avoid further charges, and, when requested, attaching new VMs to separate billing accounts to isolate costs. Where VPN authentication (Entra ID) blocked access, temporary certificate‑based VPN access or new VPN endpoints were provided as interim measures; reachability was confirmed (including via Azure Bastion where used).
Azure Lab Services issues were resolved by updating automation runbooks to stop retrying VM creation for removed users, requesting vCPU quota increases (example: 808 → 880), and escalating to Microsoft Support when necessary. Placement and cost trade‑offs for teaching and demo environments were documented (VM specs and monthly estimates, Azure Database vs VM‑based MySQL, or on‑prem/Metal Managed/VMware alternatives). Hosting or sponsorship requests (including low‑cost teaching services) were evaluated through security/privacy/legal/brand review and declined when they did not meet organizational requirements.
A formal process for professor‑owned VMs was documented and implemented: teams recorded subscription and contact details in Confluence (including professor contact, duration/term, and cost center), term‑end notifications were configured to confirm continued need, and networking/access rules, public IP policy, DNS handling (example domain iu-lab.org), authentication and key‑based SSH requirements, Windows Defender/update expectations, firewall restrictions, and backup/retention considerations were defined. For student trial accounts, IT confirmed that Azure free trials require a separate valid credit card per account, researched whether the university could provision student trials, requested details about required resources and cost limits from instructors, and provided alternative sign‑up instructions that reportedly do not require a credit card (staff‑provided link not tested by IT). Portal access problems and student HPC or account requests were forwarded to the infrastructure team and recorded when provisioning was declined or delayed.
3. Secret Management and Expired Client Secrets Causing Service Failures
Solution
An Azure Key Vault was provisioned and bound to the project environment (for example kv-iu-cama-mpl-we1 in rg-iu-cama-mpl-we1-prd under subscription 2f7cd2bc-ae88-4028-8d65-6c990e34df20). Access was granted to the requester and relevant service accounts, secrets were moved from flow/code into the Key Vault, and expired client secrets were replaced so dependent services resumed operation. Key Vault endpoints and access principals were provided to application owners to replace hard‑coded credentials.
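The pattern of reading a secret from the vault at runtime, rather than hard-coding it, can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the azure-identity and azure-keyvault-secrets packages, and the 30-day warning window is a hypothetical value. The expiry check only helps if an expiry date was set on the stored secret.

```python
from datetime import datetime, timedelta, timezone


def vault_url(vault_name: str) -> str:
    """Build the public-cloud Key Vault URL from the vault name."""
    return f"https://{vault_name}.vault.azure.net"


def make_client(vault_name: str):
    """Create a SecretClient. Imports are local so the helpers below work without the SDK."""
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    return SecretClient(vault_url=vault_url(vault_name), credential=DefaultAzureCredential())


def read_secret(client, name: str) -> str:
    """Fetch the current value of a secret at runtime instead of hard-coding it."""
    return client.get_secret(name).value


def expiring_soon(client, within_days: int = 30, now=None):
    """Return the names of secrets whose expiry date falls inside the warning window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now + timedelta(days=within_days)
    return sorted(
        props.name
        for props in client.list_properties_of_secrets()
        if props.expires_on is not None and props.expires_on <= cutoff
    )
```

A scheduled job could call expiring_soon(make_client("kv-iu-cama-mpl-we1")) and alert the application owner before a client secret lapses.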
4. Azure OpenAI / API Key Provisioning, Endpoint Access and Cost Attribution Constraints
Solution
Azure OpenAI endpoints and model deployments (GPT‑4, GPT‑4o, GPT‑3.5 Turbo, deployed in Azure as GPT‑35‑Turbo) were provisioned and API keys/endpoints were issued to requesters. When reported throughput risked exceeding shared limits, dedicated Azure OpenAI endpoints were allocated (for example iuaiacademydedicated.openai.azure.com in the Canada datacenter) and capacity sizing used customer‑provided TPM and RPM estimates (customer examples included ~229,500 TPM or ~45,000 TPM and RPM targets of 3–5) to resolve timeouts and rate‑limit failures.
Where provisioning requests were missing required metadata, support collected the necessary details (team name, service/project description/use case, target model(s), OKRs, whether students interact in real time, cost center, and geographic/GDPR requirements) before issuing keys. Unified‑endpoint API keys were created for both Version 1 and Version 2 when requested; keys were distributed via secure channels (SAFE one‑time links, controlled SharePoint) and, in documented cases, delivered to a recipient’s private email with confirmation over internal communications when appropriate. When one‑time links were lost, support regenerated and resent keys and offered Playground access to confirm connectivity.
Platform constraints affecting cost attribution were documented: Azure OpenAI usage metrics did not split consumption by API key, and streaming responses did not include a Usage field. To produce per‑bot/daily cost reports despite those constraints, per‑bot unified‑endpoint keys were created (when the bot key field was left empty, GCD auto‑generated a key, which associated that key with the bot name), and middleware‑level request logging plus Finance‑DB integration was used to map consumption to cost centers. Endpoint/unified‑endpoint details and the issued keys were communicated to requesters; teams confirmed bots continued to function after key changes.
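Because the platform did not report usage per key, per-bot reporting depends on what the middleware records for each request. The sketch below shows one way such log entries could be rolled up into daily cost per bot. The log schema, the key-to-bot mapping, and the flat price per 1,000 tokens are all hypothetical; real pricing differs by model and by prompt versus completion tokens, and token counts for streamed responses would have to be estimated by the middleware itself.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RequestLog:
    """One middleware log entry (hypothetical schema)."""
    day: date
    api_key_id: str          # identifier of the per-bot key, never the key itself
    prompt_tokens: int
    completion_tokens: int


def daily_cost_by_bot(logs, key_to_bot, price_per_1k_tokens):
    """Sum tokens per (day, bot) and convert them to cost with a flat illustrative price."""
    tokens = defaultdict(int)
    for entry in logs:
        # Requests made with an unknown key are kept visible rather than dropped.
        bot = key_to_bot.get(entry.api_key_id, "unattributed")
        tokens[(entry.day, bot)] += entry.prompt_tokens + entry.completion_tokens
    return {
        group: round(total / 1000 * price_per_1k_tokens, 4)
        for group, total in tokens.items()
    }
```

Mapping each bot to a cost center would then be a second lookup applied to the keys of the returned dictionary.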
5. Log Analytics / Sentinel Ingestion Cost Analysis and Commitment Tier Selection
Solution
Ingestion usage was analyzed across days and months (noting averages, lows of ~80 GB/day, and 26 days below 100 GB), and commitment tier pricing estimates from Microsoft were compared against pay-as-you-go forecasts (the 100 GB/day commitment estimate was ~10,758.28 €/month). The comparison indicated that a lower fixed commitment (e.g., the 100 GB/day tier) would have offered the greatest savings for the observed usage distribution, compared with higher fixed tiers or continuing on pay-as-you-go.
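The comparison can be reproduced from daily ingestion figures with a small calculation. The sketch below is illustrative only: all prices are placeholders to be replaced with the actual Microsoft estimates, and it assumes that ingestion above the committed volume is billed at the tier's effective per-GB rate, which should be verified against the current pricing terms.

```python
def pay_as_you_go_cost(daily_gb, price_per_gb):
    """Total cost when every ingested GB is billed at the pay-as-you-go rate."""
    return sum(daily_gb) * price_per_gb


def commitment_tier_cost(daily_gb, tier_gb_per_day, tier_price_per_day):
    """Total cost for a fixed daily commitment.

    Every day costs at least the tier price; ingestion above the commitment is
    billed at the tier's effective per-GB rate (assumption, see text above).
    """
    effective_rate = tier_price_per_day / tier_gb_per_day
    return sum(
        tier_price_per_day + max(0.0, gb - tier_gb_per_day) * effective_rate
        for gb in daily_gb
    )


def cheapest_option(daily_gb, price_per_gb, tiers):
    """Compare pay-as-you-go with each tier in {GB per day: price per day}."""
    options = {"pay-as-you-go": pay_as_you_go_cost(daily_gb, price_per_gb)}
    for size, price in tiers.items():
        options[f"{size} GB/day"] = commitment_tier_cost(daily_gb, size, price)
    return min(options, key=options.get), options
```

Days below the committed volume still pay the full tier price, which is why a period with many sub-100 GB days favors the lowest tier over larger commitments.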
6. VPN / Certificate Gating and SSH Key Authorization Blocking Access to Azure VMs (GPU Instances)
Solution
Access failures were resolved by provisioning the required corporate VPN credentials and ensuring the target VM accepted the correct access method. A VPN specialist provisioned and delivered the client certificate and password via the Safe App; once the user installed and configured the VPN client the network-gated VMs became reachable. In addition, SSH access failures were resolved by adding the user’s SSH public key to the VM’s allowed keys. For Azure Lab incidents, support confirmed the lab enforced an "RDP from Internet"/VPN policy and clarified that RDP only applied to Windows VMs; failed RDP attempts against Linux template VMs (which did not have an RDP service) were expected. Traceroute hops/timeouts were observed but the Azure endpoint was reachable by ping, and reported auto-shutdown errors were investigated and not found to be the root cause of the access issues. After certificate delivery, VPN installation, and ensuring the correct remote-access service/keys were present, users regained RDP/SSH access.
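Adding a user's public key to a Linux VM amounts to appending it to the authorized_keys file of the target account with restrictive permissions. The sketch below shows that step in an idempotent form; the function name and the way the key reaches the VM are assumptions, and in practice the same result may be achieved through the portal or provisioning tooling.

```python
from pathlib import Path


def authorize_key(public_key: str, ssh_dir: Path) -> bool:
    """Append public_key to authorized_keys unless it is already present.

    Returns True if the key was added, False if it was already authorized.
    """
    key = public_key.strip()
    ssh_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    auth_file = ssh_dir / "authorized_keys"
    text = auth_file.read_text() if auth_file.exists() else ""
    if key in (line.strip() for line in text.splitlines()):
        return False
    # Avoid gluing the new key onto a last line that lacks a trailing newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with auth_file.open("a") as handle:
        handle.write(prefix + key + "\n")
    # sshd typically ignores key files that other users can write to.
    auth_file.chmod(0o600)
    return True
```

Running it twice with the same key leaves a single entry, so it is safe to repeat when a ticket is reopened.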
7. Missing Cost Management Data After Contract Renewal Due to Point of Cost Analysis Reset
Solution
The Point of Cost Analysis setting had been overwritten during the Azure Contracts renewal. The Point of Cost Analysis was restored to the correct configuration, current cost charts and a cost export were provided to the user for immediate review, and users were informed that propagation could take up to 48 hours. A follow-up verification confirmed the Cost Analysis view was working again.
8. Migrating Application Backups to an Azure Storage Account and Stopping Old Backup Targets
Solution
Backups were migrated by disabling/stopping the legacy cron job and updating the backup script to target the new Azure Storage Account. Relevant mount/fstab entries were adjusted so the new storage became the active backup destination, and Veeam backups were left intact to continue protecting the system during and after the migration.
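When a backup script writes to a mounted storage path, one common failure is that the mount is missing and the archive silently lands on the local disk. The sketch below shows a guarded copy step under that assumption; the function name and parameters are hypothetical, and the actual script and mount type used in this migration are not documented here.

```python
import os
import shutil
from pathlib import Path


def copy_backup(archive: Path, target_dir: Path, require_mount: bool = True) -> Path:
    """Copy a backup archive to target_dir and verify the copied size.

    target_dir is expected to be the mount point of the new storage itself.
    """
    if require_mount and not os.path.ismount(target_dir):
        raise RuntimeError(f"{target_dir} is not mounted; refusing to write to local disk")
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / archive.name
    shutil.copy2(archive, destination)
    if destination.stat().st_size != archive.stat().st_size:
        raise RuntimeError(f"size mismatch after copying {archive.name}")
    return destination
```

The require_mount flag exists only so the function can be exercised against an ordinary directory during testing.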
9. Provisioning and Onboarding of a New Microsoft Tenant for a Course Integration
Solution
A dedicated Entra ID tenant was provisioned for the UFred project using the chosen namespace (ufredstudy.onmicrosoft.com). Cross-tenant access was enabled for IU and LIBF staff via guest access patterns, tenant-onboarding automation and group templates were applied to meet group-creation policy needs, and the Course Feed integration was activated and scheduled to support mid‑March GoLive testing. Required data-handling and contractual checks were completed as part of the onboarding, and the provisioning timeline was aligned to the requested testing and GoLive dates.
10. Transient Microsoft / Azure Service and Network Infrastructure Outages Causing Access Failures
Solution
The interruption was identified as a Microsoft-side network infrastructure incident. Microsoft applied networking configuration changes and performed failovers to alternative network paths; services subsequently recovered and normal access was restored. Internal support validated service health, communicated the outage to affected users, and confirmed that no local configuration changes were required once Microsoft remediated the incident. The major outage event was recorded with an approximate start time of 11:45–12:00 UTC on 30 July 2024 for tracking and post‑incident review.
11. UniFLOW Online cloud-printing integration and inconsistent card/PIN authentication
Solution
Investigation and vendor engagement (Printvision/UniFLOW support) established that interactive admin sign-in was part of the UniFLOW Online provisioning flow for Entra ID enterprise application user objects, and that card authentication behavior depended on the authentication method and device firmware. Vendor documentation and configuration guidance were provided that explained PIN issuance and notification behavior (PIN emails were not sent by default in the examined configuration), the PIN length/policy options available in the UniFLOW admin settings, and the job-retention (print-job lifetime) defaults and where to change them. Where card PIN prompts were inconsistent, the vendor identified device firmware/configuration mismatches and supplied firmware/setting recommendations to enforce PIN prompts consistently.
12. SFTP access to Azure Blob blocked by missing credentials and Intune app-install restrictions
Solution
The request remained unfulfilled after investigation: no usable credentials were provided for the storage account SFTP service user and the endpoint could not be accessed from the user’s workstation because Company Portal/Intune blocked the required client installs. The ticket record showed no further activity from the requester and was closed without provisioning access. The actionable findings recorded were that a service owner needed to supply valid SFTP credentials (or an appropriate SAS/managed identity method) and that Intune policy or app deployment approval needed to permit WinSCP/FileZilla installation before a user-level connection could be established.
13. Automated Course Feed membership removals linked to Teams app update and Logic Apps workflow
Solution
Investigation traced the removals to an automated workflow (Azure Logic Apps/Microsoft Flow) that used the service account to sync Course Feed membership and that behaved incorrectly after a Teams app update. The problematic flow runs were modified/disabled and the membership logic in the integration was corrected so that course assignments were respected. After the workflow was updated and redeployed the repeated removals stopped and Course Feed memberships remained stable.
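The corrected behavior, removing a member only when no course assignment exists for them, can be expressed as a set comparison. The sketch below is a generic illustration rather than the actual Logic Apps workflow; the removal-ratio guard is an added safety assumption that stops a run when the assignment feed looks empty or truncated.

```python
def plan_membership_sync(assigned, current, max_removal_ratio=0.2):
    """Return (to_add, to_remove) so that membership matches course assignments.

    Raises ValueError when the plan would remove a suspiciously large share of
    current members, which usually points to a broken source feed.
    """
    assigned, current = set(assigned), set(current)
    to_add = assigned - current
    to_remove = current - assigned
    if current and len(to_remove) / len(current) > max_removal_ratio:
        raise ValueError(
            f"refusing to remove {len(to_remove)} of {len(current)} members in one run"
        )
    return sorted(to_add), sorted(to_remove)
```

A run that hits the guard would be reviewed by hand instead of applying the removals automatically.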
14. Unexpected Microsoft notice about unused subscriptions and subscription ID legitimacy concerns
Solution
The incident was escalated to Azure support and tenant administrators to verify the subscription IDs against their tenant(s). Microsoft/Support confirmation and portal validation were used to establish whether each listed subscription was legitimately associated with the tenant or was a stale/unused subscription. Where ownership was confirmed, teams were informed of the timelines and either reactivated or removed the subscriptions; where items could not be traced, a support case was opened with Microsoft to investigate ownership and prevent unintended termination. The event was documented as a legitimate portal notification pending admin verification rather than an immediate automated termination.
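Checking the listed subscription IDs against the tenant's own inventory is a simple comparison once both lists are normalized. The sketch below assumes the tenant's subscription IDs have already been exported by some means (portal or CLI); the function names are hypothetical.

```python
import uuid


def normalize_subscription_id(raw: str) -> str:
    """Return the canonical lowercase GUID form; raises ValueError if not a GUID."""
    return str(uuid.UUID(raw.strip()))


def classify_subscription_ids(listed, tenant_ids):
    """Sort IDs from a notice into known, unknown, and malformed."""
    known = {normalize_subscription_id(s) for s in tenant_ids}
    result = {"known": [], "unknown": [], "malformed": []}
    for raw in listed:
        try:
            sub_id = normalize_subscription_id(raw)
        except ValueError:
            result["malformed"].append(raw)
            continue
        result["known" if sub_id in known else "unknown"].append(sub_id)
    return result
```

IDs that end up under unknown or malformed are the ones worth raising with Microsoft Support before any action is taken on the notice.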