The contents of this page are copied directly from IBM blog sites to make them Kindle friendly. Some styles and sections from these pages have been removed so the content renders properly in 'Article Mode' of the Kindle e-Reader browser. All content on this page is the property of IBM.


3 Ways IBM Security Can Help Companies Handle a Ransomware Attack

6 min read

By:

John Le, Program Director, IBM Security Product Marketing

The 2023 Threat Intelligence Index reveals ransomware attacks are getting faster. IBM Security can help organizations protect and defend against them.

As organizations increasingly migrate higher volumes of data to the cloud, more of their sensitive data is at risk of being compromised. The IBM Security X-Force Threat Intelligence Index 2023 found that 17% of attacks involved ransomware. Even though the number of ransomware attacks declined slightly over the year, these attacks have become much faster for threat actors to deploy. In fact, an X-Force study found that the time to execute an attack dropped 94% over two years: in 2019, the average ransomware deployment time was over 60 days; in 2021, it was only 3.85 days. Clearly, with attackers moving faster, organizations must take a proactive, threat-driven approach to cybersecurity. They need modern solutions that detect real threats, secure their data and respond quickly to attacks.

How does ransomware work?

Ransomware attacks occur when hackers gain access to a computer or device and then restrict users from accessing their own files until a ransom is paid. Sending a ransom note, however, is the final stage of the attack. A ransomware attack can be summarized in four main steps:

  • Step 1: An attacker first tries to gain access to the network—this could happen months or even years before the attack takes place.
  • Step 2: Once they have initial access, the attackers move laterally through the infrastructure to escalate their access privileges, for example to the administrator level.
  • Step 3: Upon succeeding, they install the ransomware, encrypting files and sensitive data.
  • Step 4: It is only after this deployment that the ransomware is revealed to the victim.
How companies can handle a ransomware attack

Mitigating the damage of a cyberattack or a data breach and avoiding reputational damage are the top priorities for organizations. IBM Security recommends three ways to handle ransomware attacks:

  1. Early ransomware detection with powerful endpoint security for faster responses
  2. Leverage an encryption solution to help protect your sensitive data.
  3. Use AI-powered threat intel and analysis to generate high-fidelity alerts.
1. Early ransomware detection with powerful endpoint security for faster responses

Amplifying your cybersecurity with a strong endpoint detection and response (EDR) solution should be one of the top items in your incident response plan. Why? Because an EDR platform helps contain threats before devices get encrypted by ransomware. An AI-driven EDR solution can help detect and remediate known and unknown threats within seconds. Unlike antivirus software, EDR tools don’t rely on known signatures, so they can detect threats that have never been seen before.

An AI-driven EDR solution like IBM Security QRadar EDR relies on behavior detection, identifying anomalous activity such as ransomware behavior in near real-time. This could be an unusual backup deletion or an unexpected encryption process that starts without warning; the solution automatically terminates it upon detection.

As new and sophisticated threats of ransomware variants emerge, IBM Security QRadar EDR uses data mining to hunt for threats that share behavior and functional similarities and responds as needed. This helps security teams quickly identify if new threats have entered an environment and understand “early warning signs” of an attack so weak spots or vulnerabilities can be effectively patched.

This post further details four ways IBM Security QRadar EDR can help you prevent ransomware, including detecting and responding to phishing attacks. Request a live product demo to see IBM Security QRadar EDR in action.

2. Leverage an encryption solution to help protect your sensitive data

The threat from ransomware has grown exponentially as the tools and capabilities available to threat actors have become more sophisticated. To help protect against such threats, organizations have started to deploy protection along every layer of the chain. Encryption of corporate data is one such defensive mechanism against ransomware. Encrypting data renders it useless to the threat actor looking to exfiltrate the information.

If a threat actor then encrypts the already-encrypted data, the organization can restore business operations by restoring from a secure backup. Along with deploying encryption technologies, sophisticated organizations may also deploy security solutions that detect suspicious user behavior on sensitive data spread across multiple clouds.
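
To make that point concrete, here is a minimal, illustrative Python sketch using the open-source cryptography library (not Guardium itself) that shows why data encrypted at rest is worthless to an attacker who exfiltrates it without the key:

```python
# Illustrative only: this shows the principle of encryption at rest,
# not how IBM Security Guardium Data Encryption is implemented.
from cryptography.fernet import Fernet, InvalidToken

key = Fernet.generate_key()   # in practice, kept in a key manager, never stored next to the data
record = b"customer_id=4711;card=4111-1111-1111-1111"

ciphertext = Fernet(key).encrypt(record)   # what actually sits on storage
print(ciphertext[:32], b"...")             # unreadable without the key

# An attacker who steals the ciphertext but not the key gets nothing useful:
try:
    Fernet(Fernet.generate_key()).decrypt(ciphertext)
except InvalidToken:
    print("Exfiltrated data cannot be decrypted without the original key")
```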

IBM Security Guardium Data Encryption is a robust encryption solution that combines standard encryption methods with dependable and adaptable capabilities, such as application allowlisting and intelligent access policies. Through application allowlisting, the solution permits only authorized users to encrypt and decrypt critical business data wherever it resides. Any unknown process is detected at the guard point and denied access before it can read or encrypt the data. In this way, application allowlisting neutralizes the malware: even if the malware can identify that sensitive data exists, it is blocked from encrypting the underlying data. And if the encrypted data is stolen, it holds no value for the intruder, since it cannot be used to expose confidential information.

Guardium Data Encryption also incorporates fine-grained, policy-based access controls that define which users have access to specific protected files, applications and the corresponding activity the user can perform. Applying these policies across the network helps ensure that malware cannot exploit inconsistent privileges.

Enforcing granularity also improves governance capabilities, because role-based access controls make separation of duties more clearly defined and simpler to audit. Guardium Data Encryption’s granular access controls go beyond just the user’s identity and the activity they are requesting to perform. As any strong encryption tool should, Guardium Data Encryption creates policies based on a wide range of criteria, such as processes, time constraints, the type of data source being accessed and the level of sensitivity.
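
As a purely hypothetical illustration of such fine-grained policy evaluation (the policy fields below mirror the criteria named above but are not Guardium’s actual schema), consider a check that combines user, process, action and time of day:

```python
# Hypothetical sketch of fine-grained, policy-based access control.
# The policy fields mirror the criteria described above; they are not
# Guardium Data Encryption's actual policy schema.
from datetime import datetime, time

POLICY = {
    "resource": "/finance/payroll.db",           # the protected data source
    "allowed_users": {"payroll-svc"},
    "allowed_processes": {"/usr/bin/payroll-app"},
    "allowed_actions": {"read"},
    "allowed_hours": (time(8, 0), time(18, 0)),  # a time constraint: business hours only
}

def is_allowed(user: str, process: str, action: str, when: datetime) -> bool:
    """Grant access only when every policy criterion is satisfied."""
    start, end = POLICY["allowed_hours"]
    return (
        user in POLICY["allowed_users"]
        and process in POLICY["allowed_processes"]
        and action in POLICY["allowed_actions"]
        and start <= when.time() <= end
    )

# An unknown binary trying to write the database at 02:00 is denied:
print(is_allowed("payroll-svc", "/usr/bin/unknown-binary", "write", datetime(2023, 5, 1, 2, 0)))  # False
print(is_allowed("payroll-svc", "/usr/bin/payroll-app", "read", datetime(2023, 5, 1, 9, 30)))     # True
```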

The combined capabilities of Guardium Data Encryption create a protective "checks and balances" system so that these defense mechanisms can react when other controls fail. Managing all these activities may raise concerns about the overhead and resources an organization must dedicate to keep them on track. However, tools like Guardium Data Encryption are designed to execute several defense activities simultaneously, such as deploying encryption methods, governing granular access controls and managing encryption keys from one central management console.

Administrators can create policies and quickly apply them across the enterprise, which helps to avoid security gaps and inconsistencies. With a strong focus on access controls at a granular level, our solution helps reduce the number of resources needed by being very particular about which users have access to what data and the associated processes, limiting the opportunities for unauthorized access or accidental changes.

Given that cybercriminals now have access to advanced decryption tools, it is imperative for organizations to implement a modern data encryption and key management tool. IBM Security Guardium Data Encryption is a highly scalable solution that offers organizations the capabilities needed to help protect their data and business from threats such as ransomware attacks.

3. Use AI-powered threat intel and analysis to generate high-fidelity alerts

When suspicious behavior triggers an alert, security analysts need to know whether it’s a random event or tied to known cyber-adversary tactics, techniques and procedures (TTPs). This means that security operations center (SOC) teams need to correlate analytics, threat intelligence and network and user behavior anomalies in real time. SOC teams need a solution that can enrich security alerts with threat intelligence details and integrate threat intelligence with security controls to immediately block malicious domains, files, IP addresses, emails and more. Operationalizing threat intelligence in this way acts as the backbone of a modern SOC.

Threat intelligence with TTP analysis enables teams to quickly determine whether anomalous behavior within their environment is part of a recognized cyber-adversary attack like ransomware. Enriching security alerts with threat intelligence details like malicious indicators of compromise (IOCs) or related attack patterns also helps organizations align individual alerts and events to malicious intent by mapping them to the MITRE ATT&CK framework.
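
As a simplified illustration of that enrichment step, an incoming alert can be cross-referenced against threat intelligence and tagged with a matching MITRE ATT&CK technique before an analyst ever sees it. The intel feed, field names and priority logic below are hypothetical and do not reflect QRadar’s actual data model:

```python
# Hypothetical enrichment sketch: the intel feed, field names and priority
# logic are invented for illustration and are not QRadar SIEM's data model.
THREAT_INTEL = {
    "198.51.100.23": {
        "ioc_type": "known C2 server",
        "campaign": "ransomware affiliate infrastructure",
        "mitre_technique": "T1486 (Data Encrypted for Impact)",
    },
}

def enrich(alert: dict) -> dict:
    """Attach IOC context and an ATT&CK technique to a raw alert, raising its priority on a match."""
    intel = THREAT_INTEL.get(alert.get("destination_ip"))
    if intel:
        alert.update(intel)
        alert["priority"] = "high"   # anomaly plus known-bad infrastructure = high-fidelity alert
    return alert

raw_alert = {"event": "outbound_connection", "destination_ip": "198.51.100.23", "priority": "low"}
print(enrich(raw_alert))
```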

These are the capabilities that a leading SIEM solution like IBM Security QRadar SIEM can provide. QRadar SIEM includes over 500 correlation use cases, including several related to ransomware. When threat actors trigger multiple detection analytics, move across the network or change their behaviors, it can track each tactic and technique being used. More importantly, it correlates, tracks and identifies related activities throughout a kill chain, with a single high-fidelity alert automatically prioritized for the team so that they can respond seamlessly with their SOAR solution, such as IBM Security QRadar SOAR. Another IBM blog post covers much of what a leading SIEM solution can do.

Conclusion

As one of the most common and devastating attack objectives today, ransomware continues to pose a major security risk that organizations must protect against. Given the speed at which a ransomware attack is deployed today, SOC teams must rely on a set of solutions to prevent and minimize the many different attack techniques, actions and impacts that can lead to or result in ransomware. Some of the most important solutions include the following:

  1. Endpoint protection: Help prevent ransomware with early detection and minimize business disruptions with faster remediation.
  2. Data encryption: Better protect your sensitive data from being compromised.
  3. Threat intel and analysis: Help security analysts stay focused on investigating and remediating the right threats using AI-powered threat intel and analysis.
Get started with IBM Security solutions

Read the full IBM Security X-Force Threat Intelligence Index 2023.

John Le

Program Director, IBM Security Product Marketing

=======================

Accelerating Digital Finance for Women in the Next Era of FinTech

2 min read

By:

Briana Frank, Vice President of Product and Design

How Finmarie and IBM are working together to help women control their financial futures and manage their financial well-being.

We have seen the meteoric rise of FinTech within the financial services ecosystem strike gold not only with investors but also with consumers looking for innovative digital solutions to better manage their financial well-being. As the financial services industry embraces this new era, fintechs are in the driver's seat—helping create a roadmap that can provide greater financial inclusion and literacy at scale for communities across the globe.

We believe this next era of FinTech is poised to address barriers globally and help usher in deeper economic independence for women, especially those who are entrepreneurs. Now is the time for fintechs to help make this possible by delivering access to hyper-personalized tools that can help women assess financial risks and take control of their financial futures. However, it is important for fintechs to remember that these activities require them to collect, store and manage their customers’ most confidential data. To help retain customer trust, fintechs should look to technologies that can help them keep that data protected.

What is Finmarie?

Finmarie, a female-founded fintech providing financial advisory services, saw an opportunity for the financial services industry to grow both the number of female clients it serves and the diversity and quality of the financial services offered to them. Founded in Germany in 2018, the fintech quickly scaled its business from personal financial coaching to a consumer finance app that allows women to take control of their finances—from setting up pensions to selecting the right asset classes for their investment strategies.

For Finmarie, data protection and customer trust are at the heart of their business. As they help women across Europe take control of their financial futures and manage their financial well-being, it is imperative for the company to keep clients’ highly sensitive and confidential data protected. Finmarie is striving to do this by leveraging confidential computing capabilities from IBM to help protect data.

IBM + Finmarie

As part of IBM’s Hyper Protect Accelerator Program (a portfolio of 100 startups from 23 countries, 49% of which have at least one female founder), Finmarie utilizes IBM Cloud Hyper Protect Services. Hyper Protect Services feature Keep Your Own Key encryption capabilities that help Finmarie keep data secured and in compliance with its regulatory requirements.

Finmarie recognizes that wealth accumulation for women requires re-engineering access to financial products and services to expand financial inclusion and literacy. As they continue the mission to empower women to achieve both their personal and professional financial goals, Finmarie has pledged to put data trust and privacy at the heart of their operations as they work with IBM to leverage innovative security capabilities.

Briana Frank

Vice President of Product and Design

=======================

Turn Your Container Into a Trusted Cloud Identity

4 min read

By:

Henrik Loeser, Technical Offering Manager / Developer Advocate

Learn how compute resources like your deployed, containerized app can be turned into a powerful tool with attached IAM privileges, thanks to trusted profiles.

Over the years, I have learned to use API keys in my automations for IBM Cloud. API keys for user IDs and service IDs allow you to log in and perform access-restricted, protected actions. Wouldn’t it be nice to deploy apps without the hassle of securely distributing and managing API keys for them? You can already do this today, thanks to trusted profiles and compute resources.

For this blog post, I took a look at them and wrote some code to see trusted profiles with compute resources in action. Read on to learn about my journey.

Overview

Identity and Access Management (IAM) controls access to resources. IBM Cloud uses the concept of IAM IDs to abstract from users and other identities. It also has service IDs, which are identities that can be seen as “technical users” and can be used by cloud services or applications to perform tasks. Similar to regular user IDs, service IDs can create and own API keys, which are used to authenticate and are exchanged for IAM access tokens.

A newer concept is the trusted profile—another type of IAM ID. Similar to the other IAM identity types, trusted profiles are treated as a subject that is granted access in IAM policies. However, users of trusted profiles do not need to be members of the account. They can be brought in with an identity provider via federation or use an identified compute resource. Currently, the latter can be a virtual server instance in a virtual private cloud, or apps and services deployed to an IBM Cloud Kubernetes Service or Red Hat OpenShift on IBM Cloud cluster.

Using a trusted profile with a compute resource, you could run a containerized app in a Kubernetes cluster, let that app request to use the privileges granted to that profile, and perform protected administrative tasks. All that would be possible without creating any service ID, sharing API keys, etc. Too good to be true? I put that concept into action:

Activity Tracker log record for a compute resource obtaining an IAM access token.

Trusted profile with a compute resource in action

IBM Cloud Kubernetes Service is one of the supported compute resources for a trusted profile and it offers a free cluster, which is great for testing my scenario. The steps to obtain an IAM access token through a compute resource are described as part of the trusted profile documentation and with more details for IBM Cloud Kubernetes Service clusters in “Authorizing pods in your cluster to IBM Cloud services with IAM trusted profiles.”

Basically, I need to perform the following steps. First, create a trusted profile. Then, add a compute resource to the trusted profile and either allow all IBM Cloud Kubernetes Service clusters or identify a specific resource by providing the cluster identity, Kubernetes namespace and service account. Next, I grant privileges to the trusted profile by adding it as a member of access groups or by configuring access for the trusted profile directly.

With the trusted profile in place, the deployed app does the following (a sketch of this flow follows the list):

  • Read the service account token.
  • Use the service account token with the name of the trusted profile to request the IAM access token.
  • Perform the IAM-protected tasks.
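
The exchange itself boils down to one call against the IAM token endpoint. The following is a minimal Python sketch of that flow; the projected service account token path and the exact form-field names are assumptions on my part, so verify them against the trusted profile documentation for your cluster configuration:

```python
# Sketch of the three steps above. Assumptions: the projected service account
# token path and the form-field names of the IAM compute-resource grant; check
# the trusted profile documentation for the exact values in your setup.
import requests

SA_TOKEN_PATH = "/var/run/secrets/tokens/sa-token"   # projected token path from my pod spec (assumption)
IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"

def iam_token_for_profile(profile_name: str) -> str:
    # Step 1: read the service account token mounted into the pod
    cr_token = open(SA_TOKEN_PATH).read().strip()
    # Step 2: exchange it for an IAM access token under the trusted profile
    resp = requests.post(
        IAM_TOKEN_URL,
        data={
            "grant_type": "urn:ibm:params:oauth:grant-type:cr-token",
            "cr_token": cr_token,
            "profile_name": profile_name,
        },
        headers={"Accept": "application/json"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]

def list_resources(access_token: str) -> list:
    # Step 3: perform an IAM-protected task, e.g., list resource instances in the account
    resp = requests.get(
        "https://resource-controller.cloud.ibm.com/v2/resource_instances",
        headers={"Authorization": f"Bearer {access_token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("resources", [])

if __name__ == "__main__":
    token = iam_token_for_profile("my-trusted-profile")   # hypothetical profile name
    print(f"{len(list_resources(token))} resource instances visible (first page)")
```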

For testing, the above steps can also be performed by providing a sample job or logging into the shell of the running container and manually issuing the necessary commands from the shell. I tested both options, then combined everything for simplicity and as the foundation for an actual administrative app.

The small Python app offers two API functions. The first verifies that the app is running properly; the second accepts a trusted profile name as a parameter and tries to read the service account token, turn it into an IAM access token and then list the resources in the account. Both return additional debug/educational output.
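
For orientation, here is a rough sketch of how such a two-endpoint app could be structured with Flask. It is my own illustration, not the published sample code, and it reuses the token-exchange helpers from the previous sketch via a hypothetical token_flow module:

```python
# A rough Flask sketch of the two API functions described above; this is my
# own illustration, not the published sample code. The token_flow module is
# hypothetical and stands in for the helpers from the previous sketch.
from flask import Flask, jsonify

from token_flow import iam_token_for_profile, list_resources  # hypothetical module

app = Flask(__name__)

@app.get("/health")
def health():
    # First API function: verify that the app is up and running
    return jsonify(status="ok")

@app.get("/resources/<profile_name>")
def resources(profile_name: str):
    # Second API function: read the SA token, exchange it for an IAM token, list resources
    token = iam_token_for_profile(profile_name)
    return jsonify(resource_count=len(list_resources(token)))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```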

For the tests, I deployed the app to different namespaces in my IBM Cloud Kubernetes Service cluster. Then, I configured matching and non-matching compute resources for the trusted profile. Next, I ran tests like the ones shown in the screenshot below. After getting into the shell of the running container, I used curl to kick off different authorization flows. Depending on whether a trusted profile exists, there are different error messages when access is denied:

Testing how to authorize an app to perform IAM-protected actions.

The last invocation is with the trusted profile and matching compute resource configured and is successful, returning a list of resources in the account and other debug output. The screenshot in the previous section shows an Activity Tracker log record of a compute resource login (“iam-identity.computeresource-token.login”).

As shown in my tests, getting from the app requesting the IAM access token to successfully receiving it involves the checks and possible error messages as depicted in the following diagram:

Requesting an IAM access token.

Conclusions

Trusted profiles are a type of IAM identity, and similar to other identities, they can have access privileges attached directly and can be members of IAM access groups. A difference is that the identity of trusted profiles can be assumed (i.e., users, apps or processes can operate under the identity of a trusted profile). One such way—which I blogged about in “Secure Onboarding for Your Workshops and Hackathons”—is through identity providers (e.g., App ID).

Another option to assume the identity of a trusted profile is through compute resources. In this blog post, I showed that no API key or password needs to be made available to perform IAM-protected actions. All that was needed was to specify the compute resource from which the app can obtain the IAM access token. This simplifies the process and often enhances security. As discussed and shown in this post, my app itself, deployed in the designated namespace, serves as the “turnkey” for performing the work.

If you want to learn more about trusted profiles with compute resources, you can use my sample code as a starter. If you have feedback, suggestions or questions about this post, please reach out to me on Twitter (@data_henrik), Mastodon (@data_henrik@mastodon.social) or LinkedIn.

Henrik Loeser

Technical Offering Manager / Developer Advocate

=======================

Rethinking Recruitment Automation

5 min read

By:

IBM Cloud Team, IBM Cloud

Three talent acquisition experts offer advice on using intelligent automation to find (and keep) top talent.

“A mentor of mine said, why don’t you use your initials, which are conveniently A.L. So, I would submit my resume as Al Hood and people thought I was a man. And I would get interviews.” — Angela Hood, founder and CEO of ThisWay Global

In an interview on Smart Talks with IBM with Malcolm Gladwell, Angela Hood discusses how intelligent automation technology can make it easier than ever to connect with the best candidates for a more creative and competitive workforce.

The unique nature of today’s job market—with nearly two available jobs for every person looking, but not enough job seekers with the skills required to do the work—is driving employers to adopt new strategies for acquiring talent. One of those strategies is looking to passive talent (i.e., candidates not actively seeking a new position) to build relationships with people who have the required skill sets. Hood advises that when you talk to these individuals, you need to be able to say two things:

  • We use the best technology to identify you because you and your skills are unique, and we want you to come work for us.
  • When you get here, we’re going to help you automate the parts of your job that you’ve never really enjoyed before, because we want you to dig into the areas you’re passionate about.
Rethinking the recruitment process

Jennifer Carpenter, vice president of talent acquisition and executive search at IBM, recognizes that there’s never been a better time to rethink the recruitment process—a process where hiring managers and recruiters are mired in administrative tasks while the search for exceptional talent becomes increasingly competitive.

In “Let's rethink recruiting,” Carpenter and Wendy Wick, a global talent acquisition expert at IBM, address the question: If you were to design the recruitment process for a new company, what would you automate to help HR acquire top talent?

Here are four best practices for improving recruitment with the help of intelligent automation:

  1. Shorten the application process: “The first thing I would do is rethink why I even need a resume,” Carpenter says. Most organizations collect applications via online portals where candidates are required to input a lot of information. “That can take as much as five minutes—and in today’s marketplace, those five minutes will result in too few applications,” says Wick. The improved approach, especially for jobs that are paid hourly, is to forgo a lengthy registration process and shift to what Wick calls a “soft application.” This could include asking candidates to enter their name, contact information and answer up to five custom questions, all via text message.
  2. Ask only what you need to know: “I believe in conversation starters, not deal breakers. By asking for the right information from the candidate at the right time—and keeping the experience light—you treat them more respectfully,” says Carpenter. Asking candidates to provide relevant and not excessive information shows you value their time. Wick shares this best practice example: Candidates for an hourly warehouse position are asked whether they can lift 40 pounds and if they’re willing to work nights and weekends. They fill out the application in less than 90 seconds and, with the help of automation, are directed to schedule an interview or talk to the right recruiter for follow-up. Wick says that when this effort is applied to 50% of an organization’s job openings, the recruiting cycle can be shortened by as much as 21 days for hourly wage jobs.
  3. Approach repetitive tasks like a retailer: Carpenter says, “If I can go on my phone and book a hair appointment, I’m expecting to engage with organizations as easily.” Scheduling a job interview can be frustrating for all parties involved. Organizations can look to their favorite online booking experiences for inspiration and develop a similar experience for contacting candidates and scheduling interviews.
  4. Fully understand your current state: “Often, we think we need to automate something because there’s a lot of noise around it,” Carpenter says. “What you initially thought you’d automate might change as you observe how your team is actually operating.” For example, recruiters rarely re-engage candidates who don’t get hired, and according to Wick, “62% of hourly job seekers don’t hear back after applying for a position.” These are missed opportunities to re-recruit individuals who have already expressed interest.
Transforming the recruitment process with intelligent automation

Once an organization can fully observe its current state, it’s easier to identify which intelligent automation solutions can reduce administrative burdens like gathering application data or scheduling interviews. As Wick says, “Imagine having a daily digest sent to recruiters with a rundown of that day’s scheduled interviews and other top priorities. That immediately helps with prioritization.” Intelligent automation solutions, such as digital workers, are being employed to automate dull and repetitive tasks, giving recruiters more time to do the job they were hired to do: act as advisors to the business.

“Digital workers, also known as digital employees, are software-based labor that can independently execute meaningful parts of complex, end-to-end processes using a range of skills. The digital worker operates side-by-side with the human employee, performing tasks like collecting documentation from customers or generating reports.” — The Business Leader’s Guide to Digital Worker Technology for Improving Productivity

Get a digital worker

ThisWay Global’s partnership with IBM includes the use of IBM Watson Orchestrate, a digital worker that is enabling Hood’s company to simplify its customers’ hiring process.

Hood likens Watson to a “concierge” and describes using it like this: “You can have all of your job descriptions living inside of Box. And you’re like, ‘Oh, I need to find someone for this job.’ Watson goes into Box, grabs the job description, sends that into ThisWay’s system, and ThisWay automatically surfaces it up to 300 qualified people from diverse organizations. Then Watson automatically sends out communication to the candidates you’re interested in.”

With the help of IBM Watson Orchestrate, recruiters don’t have to spend time figuring out where to source candidates from and how they’re going to reach out to diverse organizations. A process that typically took three weeks has been reduced to roughly three to four minutes, giving recruiters the time to do what they want to do—connect with people.

One of the unexpected benefits of the technology is how it opened communication between hiring managers and recruiters inside the same company, in effect un-siloing the work.

Hood describes the level of innovation in intelligent automation technology as “absolutely incredible,” and encourages companies to try something—be a part of it. “We’re going to see massive innovation over the next 5 to 10 years… and you don’t want to miss that. You don’t want to say, ‘Oh, I sat on the sidelines because I had a bad experience a decade ago.’”

Companies need to find talent. Talent wants to be found. Now the technology is here to help make recruiting more efficient and effective for everyone. “It’s time to delightfully disrupt this space,” says Carpenter.

Listen to Jennifer Carpenter answer the question: “What if you were to design the recruitment process for a new company? What would you automate to help HR find (and get) the best talent, always?”

Listen to the full interview – Combatting Hiring Bias – with Angela Hood, Malcolm Gladwell and Jacob Goldstein

IBM Cloud Team

IBM Cloud

=======================

Solutioning at the Edge

5 min read

By:

Ashok Iyengar, Executive Cloud Architect
Doug Eppard, Cloud Integration Architect

Solutioning is a relatively recent addition to the IT lexicon that refers to the process and method of producing an IT solution.

An IT solution is a set of related software programs and/or services that solves one or more business problems. IT vendors, service providers and value-added resellers (VARs) market software bundles or product offerings as part of the solution repertoire.

In this blog post, we will look at solutioning from an IT architect’s perspective—more specifically, what it means to come up with a solution in the edge computing domain. Whether it is working with a business to define requirements, working with business partners to frame opportunities, managing vendors to deliver the solution or working with internal staff to use the solution effectively, IT projects can be complex. We will also take a brief look at complex solution architecture.

Please make sure to check out all the installments in this series of blog posts on edge computing.

The solution architect’s role in solution design

End-to-end solution design involves creating a solution architecture containing artifacts, including assumptions, diagrams (context, networking and data flow), and a service management and responsibility RACI matrix. The end goal is to meet customer requirements. 

The role of a solution architect (SA) is to create a comprehensive, end-to-end architecture that solves a particular business problem and its associated set of requirements. The SA should demonstrate how the solution fits into the existing enterprise architecture (which typically comprises technology and business layers) and aligns with the well-defined principles and strategies typically established by an organization’s enterprise architecture team. Solution architects design and describe the solution, and they ensure the solution addresses service management aspects like operations and maintenance. Most importantly, they provide the linkage from the business problem and requirements to the technology solution.

Solution architects use diagramming tools, design templates, business models, sequence diagrams and architecture frameworks to craft the relevant solution architecture artifacts that address functional and non-functional requirements (NFRs). Figure 1 shows one such solutioning framework, which is a matrix of domains and aspects:

Figure 1: Domains and aspects in an architecture framework.

There are two key factors that cannot be overlooked: business and technology. Technology is the enabler for desired business capabilities. Once the technology components in the various domains are identified, it is an exercise in making sure they are well integrated and the solution can be delivered in an acceptable amount of time.

Edge computing as a solution domain

In previous blog posts, we have described edge computing as the discipline that brings the power of the data center out to devices at the edge. That could be the network edge, enterprise edge or the far edge (see Figure 2):

Figure 2: The edges in edge computing.

Each edge offers its own set of challenges, from the type of compute to network latency to the amount of available storage. A network edge-related solution for a telecommunications company may not encompass other edges, whereas a camera-based far edge retail solution could encompass all three edges.

Solutioning in edge computing

When it comes to solutioning at the edge, solution architects must think small—as in small footprint, limited-to-no storage, disconnected mode, etc. Examples of devices could include the Intel NUC, Nvidia Jetson, 1U computers, robots, mobile phones, smart cameras, cars and even programmable sensors. Often these devices cannot run Red Hat OpenShift, so one might have to consider the use of Red Hat MicroShift, Single Node OpenShift (SNO), K3S or RHEL for Edge with podman.

As an example, let’s look at the creation of a private 5G solution for a customer. This would fall into the “modernization” business category. In Figure 3 below, all the domain cells highlighted in light green would be in play. This provides a quick view of the domains in scope and forces the solution architect to start thinking of the various interactions and architectural decisions to be made (or components to be chosen), which correspond to those solution domains:

Figure 3: Domains in play for an edge-related solution.

Given this use case is 5G-related, there would be greater focus on the network edge. A 5G network essentially consists of three elements: a packet core, a radio access network (RAN) and user equipment (UE) or far edge devices. The next step would be to create a solution architecture diagram showing the infrastructure, network, security and integration details.

Key architectural decisions would have to be made regarding network function virtualization (NFV): whether to use VNFs (VM-based virtual network functions) or CNFs (cloud-native network functions) to run the networking functions, and how to provide other platform resources like compute and storage. The transition from physical elements to VNFs and now to CNFs is what characterizes this as a modernization journey. Hyperscalers like IBM would have to decide which communication service provider (CSP) to partner with to provide the RAN. Finally, one must consider the placement, deployment and management of workloads on the far edge devices.

Based on the architecture decisions, the solution architect creates a software stack and a bill of materials (BOM) for the solution. For example, IBM Cloud Pak for Network Automation offers many of the necessary network functions in this solution. A product like IBM Edge Application Manager can perform the placement, deployment and management of workloads. A version of Red Hat OpenShift can run the containerized workloads. All these artifacts make up the solution package that is given to the implementation team. Refer to “Architectural Decisions at the Edge” for edge-related architecture decisions.

Complex solutioning as a discipline

With the rise of IT solutions that span multiple clouds and hundreds of microservices requiring significant interoperability, solution architects must design solutions that meet stringent requirements while managing considerable integration complexity. This has given rise to a new discipline called complex solution architecture and, correspondingly, the role of the complex solution architect (CSA). A major facet of complex solutioning is risk assessment and risk mitigation.

What makes a solution complex?

A challenging client environment, the client’s digital transformation maturity, deployment in multiple geographies and languages, multiple products from multiple vendors, numerous integrations, security, compliance requirements, and FOAK (First of a Kind) implementations make for a complex solution. While iterating through solution design, it is imperative that solution architects come up with a solution that not only provides value to the customer but also fits the customer’s budget.

Wrapping up

Designing an edge solution that involves analysis of visual data captured by cameras within a private 5G network is complex. Design decisions come into play at every layer—from infrastructure to network to applications running in far edge devices.

Solution architects have many tools at their disposal, but an architecture framework helps guide the solution design process and ensures no aspect and related domains are overlooked when designing end-to-end solutions to meet customer requirements.

Let us know what you think.

Thanks to Joe Pearson and Jose Ocasio for providing their thoughts and reviewing the article.

Please make sure to check out all the installments in this series of blog posts on edge computing.

Ashok Iyengar

Executive Cloud Architect

Doug Eppard

Cloud Integration Architect

=======================

VPC Custom Image Creation and Distribution

5 min read

By:

Powell Quiring, Offering Manager

How to create and distribute IBM Cloud Virtual Private Cloud (VPC) instances from custom images.

I use IBM Cloud VPC virtual server instances (VSIs) for some workloads. I have specific requirements for the operating system version, applications and data, so I use VPC custom images created to my exact specifications to ensure the compute instances are provisioned with content that meets my requirements.

Large projects with development distributed over multiple accounts typically have corporate requirements for images. Hardened base images can be created centrally and required for all production workloads. The corporate base images can be provisioned directly or used as a starting point to derive department images.

VPC custom images can be created and distributed in a number of different ways to fit your needs. Below is a diagram capturing most of the details of such an endeavor. This blog post will drill down into subsections of this diagram and describe the details so you can apply them to your environment:

VPC custom images in the IBM Cloud.

Image basics

This diagram captures the basic usage of VPC custom images:

VPC regional images.

Images are the starting point for virtual server instances (VSIs), as represented by the dashed line. They contain the initial file system that will be used to populate the boot volume. They also contain a specification of the boot parameters.

Images are regional and can be used to start an instance in any of the availability zones in the region. In the diagram above, an image in eu-de is used to provision an instance in Zone 1. This means that an image in us-south cannot be used in eu-de; a copy of the image must be created in each region.

IBM uses cloud-init, the industry standard multi-distribution method for cross-platform cloud instance initialization. Read Getting started with custom images to get an introduction to IBM images.

Creating custom images from IBM-provided stock images

The most straightforward way to create a custom image is to provision a VPC VSI with one of the IBM stock images. The running instance boot volume can be prepared by using the cloud-init user data or SSH. Follow the instructions to create an image from a volume:

Create custom image from a stock image.

There are tools like Packer that automate the steps of creating an instance, copying data, executing scripts to install software and creating an image. This blog post provides an example.

On-premises images

On-premises virtual machine image files can be exported to local storage as qcow2 or vhd files. These files can be uploaded to an IBM Cloud Object Storage (COS) bucket, and a VPC image can then be imported from the bucket object. Make sure the requirements specified in Creating a Linux custom image or Creating a Windows custom image are satisfied:

Import an on-premises image.

It is also possible to export IBM VPC custom images to a COS bucket and import them as VPC custom images in a different region, or to download them for use in your own virtual environment. You can try this using the desktop QEMU workflow. The core tutorial with QEMU is a good starting point.

Most Linux distros supply “cloud images.” Search for “distro cloud images” and you will likely find them:

Linux cloud image.

Example for Ubuntu 22.04 (Jammy), current release:

Navigate to a qcow2 file like this one on the Ubuntu site.

This is a qcow2 file; download it to your laptop and verify the checksum. Rename the file to jammy-server-cloudimg-amd64.qcow2.

Upload it to a COS bucket and import it in the IBM Cloud console VPC custom images dialog by clicking the Create button and selecting Cloud Object Storage as the image source.

The information message will explain how to authorize access if the bucket is not visible:

 Select image source of Cloud Object Storage.

The authorization will look something like this:

IAM Authorization like this one is a prerequisite.

Back in the image import create dialog:

Import the qcow2 image from the bucket.

When the VPC custom image creation completes, you can use the Cloud console to create a VPC VSI with the custom image:

Operating system selection in the VPC Virtual Server Instance create dialog.
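
If you prefer automation over the console, the same import can be driven through the VPC REST API. The sketch below is a rough outline; the cos:// href format, the operating_system name and the image prototype fields are assumptions to verify against the current VPC API reference (note that VPC API requests require the version and generation query parameters):

```python
# Sketch: import a custom image from a COS object through the VPC REST API.
# The image prototype fields, the cos:// href format and the operating_system
# name are assumptions; verify them against the current VPC API reference.
import os
import requests

REGION = "us-south"
IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"
VPC_API = f"https://{REGION}.iaas.cloud.ibm.com/v1"

def iam_token(api_key: str) -> str:
    resp = requests.post(
        IAM_TOKEN_URL,
        data={"grant_type": "urn:ibm:params:oauth:grant-type:apikey", "apikey": api_key},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]

def import_image(token: str) -> dict:
    body = {
        "name": "jammy-server-cloudimg-amd64",
        # cos:// URL of the uploaded qcow2 object (bucket and object names are examples)
        "file": {"href": "cos://us-south/my-images-bucket/jammy-server-cloudimg-amd64.qcow2"},
        "operating_system": {"name": "ubuntu-22-04-amd64"},
    }
    resp = requests.post(
        f"{VPC_API}/images",
        params={"version": "2023-04-04", "generation": "2"},  # required query parameters for the VPC API
        headers={"Authorization": f"Bearer {token}"},
        json=body,
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    image = import_image(iam_token(os.environ["IBMCLOUD_API_KEY"]))
    print(image["id"], image["status"])   # status moves from pending to available
```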

Distributing images across accounts using the private catalog

There is a private catalog product type specifically for VPC custom images. A catalog product version contains a list of VPC regional images. When a VPC VSI is provisioned from a catalog product version, the appropriate regional image is used. Terraform, Packer and the CLI accept catalog product versions in addition to images when creating VSIs:

Private catalog for product x with two versions.

In the diagram, note the three steps:

  1. Create a private catalog, product x and version. The version will contain a list of identical images—each in a different region.
  2. Provision an instance specifying the private catalog, product x and one of the versions.
  3. The VSI will be started with the image in the local region.

Private catalog products can be shared across accounts in an enterprise. This makes private catalog products ideal for distributing hardened corporate images across an enterprise.

To create a private catalog product and version, open the catalog in the IBM Cloud console and click Catalog settings:

Choose Catalog settings to manage private catalogs.

  1. Select Private catalogs on the left and click Create to begin the wizard.
  2. Add a Software product and a Delivery method of Virtual server image for VPC.
  3. Choose the jammy-server-cloudimg-amd64 image that you imported earlier and set the Software version to 1.0.1 to match the diagrams.
  4. Choose a Category of Compute / Virtual Machines:

    Create a product and the first version.

To reference multiple identical images in more than one region, cancel this dialog and import the image in the desired regions:

  1. To continue, click Add product.
  2. Click on the version and walk through the wizard for the version.
  3. You will need to validate the image by providing a VPC, SSH key and subnet. This step creates an IBM Cloud Schematics workspace that provisions an instance with the provided image and will take a few minutes.
  4. Click the refresh arrow to see progress.

No other data is required. Click Next to continue through all of the steps and complete the version.

Finally, use the IBM Cloud Console to create a new VSI for VPC. In the Operating system section, choose a Catalog image for Image type and then the version you just created:

Sharing products with users in your account, enterprise, or account groups is also possible.

Conclusions

Using VPC custom images when creating VSIs can save time during provisioning. It can also ensure that each instance is initialized to meet application and corporate requirements. VSIs are provisioned from regional images. Private catalog product versions provide a single ID for a collection of identical images distributed across regions to ensure consistency. Catalog products can also be shared across enterprise accounts.

Learn more about IBM Cloud VPC.

If you have feedback, suggestions or questions about this post, please email me or reach out to me on Mastodon (@powellquiring@mastodon.social), LinkedIn or Twitter (@powellquiring).

Powell Quiring

Offering Manager

=======================

Disaster Recovery Strategy with Zerto on IBM Cloud

4 min read

By:

Debbie Van Zyl, Sr. Product Manager, VMware Resiliency Solutions
Bob Floyd, Global Alliance Architect, Zerto

Zerto on IBM Cloud is a disaster recovery solution for any size or complexity of workload.

Traditional disaster recovery meant running a second site—typically your old infrastructure in a basement or an old building, tested once or twice a year by recovering from external media. This took days, was fraught with hardware outages, required excessive staff and meant that mountains of documentation needed to be updated and configurations reworked to keep up with production.

Today, things are different. With the cloud, we have the opportunity to relocate disaster recovery into an environment that is technically matched with production, flexible and, most importantly, has the uptime needed to ensure that the landing zone is stable and ready in the event of a disaster.

IBM Cloud for VMware Solutions and Zerto offer Zerto on IBM Cloud, a disaster recovery solution for any size or complexity of workload. Ensuring data is continuously replicated and compute is ready to go when you need it makes for a cost-effective and flexible solution that reduces business risk and complexity.

Eliminate data loss and downtime with Zerto

Here's how Zerto is helping enterprises continuously protect, move and recover enterprise applications:

  • Purpose-built for disaster recovery: Zerto was purpose-built for disaster recovery, rather than being a Frankenstein mismatch of backup solutions twisted into a product marketed as disaster recovery.
  • Based on continuous data protection (CDP): The solution is based on continuous data protection, not a scheduled moment-in-time backup that might be hours or even days old when you need it. There are no snapshots to take and no physical media to worry about.
  • Protect the same workload up to three times: Being a software-only disaster recovery solution, Zerto allows you to protect the same workload up to three times on whatever infrastructure you choose. In short, if it’s a virtual workload running on a supported hypervisor, Zerto can continuously protect it.
  • Best-in-industry VM-level replication: Zerto uses the industry’s best VM-level replication engine, providing the lowest RPOs and RTOs in the business: in most cases, RPOs measured in seconds and RTOs measured in minutes.
Top three critical features of Zerto replication
  1. Journal: This is the heart of Zerto’s continuous data protection (CDP). Think of the Journal as something like a DVR for your TV. It is continuously recording so that if you miss something in your favourite show or want to see that big play from the football game again, you rewind to the exact moment in time you want to see and you are good to go. Zerto’s Journal is much the same. It lives on the replication target site, and it is continuously recording your data while that data is being created. When you experience a disaster event—be it something natural like severe weather interrupting service at your primary production site or something man-made like a ransomware event—you simply pick the moment in time immediately before the event, failover to that moment, and your production site is back up with only a short interruption and limited data loss.
  2. Logical grouping of VMs together into a Virtual Protection Group (VPG): This allows you to create a VPG that protects all the VMs involved in running an application. Zerto then treats those VMs as a single unit, ensuring Write Order Fidelity during replication so that no matter which Journal checkpoint you chose to failover to, every VM comes up at that exact same point in time. No hoping that all your snapshots are compatible and that your application works.
  3. The ability to do full failover tests: This feature is available with complete reports for audit and compliance with just a handful of mouse clicks. Just select the VPG or VPGs you want to failover, choose the checkpoint you want to test and Zerto will spin up the VMs in a bubble network with no impact to your production environment. This isn’t simulated either. These are full VMs with production data, allowing you to connect to them and confirm the integrity of the data or the application functionality. Zerto continues to run in the background during the failover process, so a test of your environment doesn’t mean a gap in your protection.

No more guessing about whether your DR strategy works or how long it will take to recover in an emergency. You will know. You can even go to the next level with the test functionality, spinning up workloads to do things like new development, testing that new development, patch testing for OS or application—all without impacting production but all with real data from production.

Pay only for what you use on IBM Cloud

Everyone needs offsite backup for disaster recovery these days but running a redundant data center for this purpose can get expensive very quickly. This is where IBM Cloud comes in.

IBM Cloud for VMware Solutions is built on VMware vSphere—the same thing you are probably running in your data center today. You know it and you trust it. While Zerto is hypervisor-agnostic, it was born on vSphere and in many ways still functions best there, which means you have all the same functionality in IBM Cloud that you have on-premises.

The only difference between using IBM Cloud for disaster recovery and having your own second data center is that you aren’t paying for a whole second data center.

Zerto and IBM Cloud work together to provide flexible, unified billing; in other words, you pay only for what you use, and you only pay IBM. Let’s say you order 400 licenses when you start your project, but work goes more slowly than anticipated and that first month you only protect 50 VMs—you only pay for 50. Maybe the project never expands past that. You never pay for more than that.

To make it more economical, Zerto replication is data only, so you are only paying for storage. There is no compute expense on the DR side until you need to spin VMs up to do a failover.

Zerto and IBM Cloud also function bi-directionally, so you can use IBM Cloud for your data replication and do failover—live or test—back to your production data center. If your production data center is unavailable or you need additional compute temporarily, you just failover right there in IBM Cloud.

These are only some of the great features that make Zerto the best DR solution on the market today and just a handful of advantages of using the combination of Zerto and IBM for your cloud DR strategy.

Learn more

Zerto on IBM Cloud

Read the technical docs

Debbie Van Zyl

Sr. Product Manager, VMware Resiliency Solutions

Bob Floyd

Global Alliance Architect, Zerto

=======================

DAOS on IBM Cloud VPC

6 min read

By:

Greg Mewhinney, Senior Engineer, IBM Cloud Performance Engineering
Paul Mazzurana, Senior Engineer, IBM Cloud Performance Engineering

How to start building your own Distributed Asynchronous Object Storage (DAOS) compute cluster on IBM Cloud.

In this blog post, we will tell you about the addition of the Distributed Asynchronous Object Storage (DAOS) open-source object store system to our growing portfolio of highly automated, quick-deploying IBM Cloud VPC-based high-performance computing infrastructure. We’ll start with an overview of the technology behind this system and then move to the automation features, how to use it and how it performs.

This newly available Terraform/Ansible-based automation system exemplifies the paradigm shift that is taking place in the world of high-performance computing (HPC). Capabilities that until very recently could only be realized with a dedicated data center and the requisite investments in hardware and personnel (along with months of planning, deploying and testing) can now, like this DAOS system, be built and deployed in 30 minutes.

Distributed Asynchronous Object Storage (DAOS)

According to the DAOS architecture overview, “DAOS is an open-source, software-defined, scale-out object store that provides high bandwidth and high IOPS storage containers to applications and enables next-generation data-centric workflows combining simulation, data analytics and machine learning.”

Unlike the traditional storage stacks that were primarily designed for rotating media, DAOS is architected from the ground up to exploit new technologies and is extremely lightweight since it operates end-to-end (E2E) in the user space with full OS bypass. DAOS offers a shift away from an I/O model designed for block-based and high-latency storage to one that inherently supports fine-grained data access and unlocks the performance of the next-generation storage technologies.

VPC infrastructure

The automation described in this blog will build a DAOS storage system on the proven IBM Cloud VPC infrastructure. A DAOS system consists of multiple components working together. At the heart of DAOS are the storage servers that are built on IBM Cloud VPC bare metal servers. For our part, we have tested with up to four 48-core servers, but larger configurations are certainly possible. Each individual DAOS server offers the following: 

  • A 48 or 96 core configuration
  • Up to 768 GB of memory
  • A 100 Gb network interface
  • 8 or 16 NVMe devices with up to 50 TB of storage

The system can be built with as many compute nodes as desired using any IBM Cloud VPC profile. A wide range of profiles is available to accommodate any computing task, offering the following:

  • 2 - 200 vCPUs
  • 4 - 5600 GB of memory
  • Up to 80 Gbps of network bandwidth

Each cluster will also include one small instance that combines bastion and DAOS administration functions.

The automation script repository

DAOS is a true embodiment of the Infrastructure as Code ideal. Building your DAOS object store system on IBM Cloud begins with a public GitHub repository containing complete instructions, configuration files and the Terraform and Ansible scripts to build out a cluster to your specification.

Cluster creation

Cluster creation using the DAOS IBM Cloud automation scripts is best described in two phases. The first phase consists of executing a set of Terraform scripts to provision cloud resources. The first step of this phase is filling out the Terraform variables file(s) that specify the cluster attributes. Next, Terraform is set in motion with the apply command. From there, Terraform proceeds to do the following:

  • Set up SSH keys.
  • Provision cloud network resources.
  • Create security groups to control access to cluster resources.
  • Provision storage servers.
  • Provision compute clients.
  • Provision the admin/bastion node.

The time required to complete the above steps depends on the desired cluster size and can be estimated from the Terraform provisioning times discussed in the “Time required to create a DAOS cluster” section below.

The final step in the Terraform provision process creates the admin/bastion node. As part of that node’s provision process, its cloud-init function will be employed via user_data to automatically kick off the second phase of cluster configuration. The following are the major steps of the second phase:

  • Install Ansible.
  • Retrieve Ansible playbooks from a git repository.
  • Run Ansible playbooks to do the following:
    • Install DAOS server packages on the DAOS servers.
    • Install DAOS client packages on the compute clients.
    • Install DAOS admin packages on the DAOS admin instance.
    • Configure all of the above and start DAOS.

The above packages are retrieved from the official DAOS package repository. The time to complete these steps can be estimated from the Ansible configuration times discussed below.

Security

The automation scripts build a cluster that employs simple and effective security practices to get you started:

  • User-supplied SSH keys
  • A jump host [bastion]
  • Firewall with only the SSH port open and restricted to your specified CIDR
  • All nodes in the cluster can only be accessed from within the VPC

From there, it is expected that you will employ the rich set of tools supplied by the IBM Cloud and the DAOS storage system to tailor these default security measures to suit your security practices as you put your cluster into production.

Time required to create a DAOS cluster

The timings we will discuss in this section were measured on varying cluster configurations in real experiments and can be used as a guideline. As always, your results may vary to some degree.

Cluster creation times were tested for three different cluster sizes:

  • One storage node with four compute nodes (1x4)
  • Two storage nodes with eight compute nodes (2x8)
  • Four storage nodes with sixteen compute nodes (4x16)

The total time to create a cluster ranged from 27 minutes for the 1x4 to 31 minutes for the 4x16, with the 2x8 falling predictably in between at just under 30 minutes. The creation time for the cluster is split between the time for Terraform to provision resources and the time for the Ansible scripts to configure the storage cluster. For the 4x16, the split is 18 minutes for Terraform and 13 minutes for Ansible. The time to destroy a cluster and return the resources to the cloud took between four minutes for the 1x4 and six minutes for the 4x16.

The total time to create a DAOS cluster is modest given the capabilities and features of the completed cluster. The provision times are kept down because many of the resources are provisioned concurrently using, in this case, Terraform’s default parallelism, which allows up to 10 simultaneous operations. This parallelism also explains why larger clusters require only small increases in total time.
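
For illustration, the parallelism can also be set explicitly; -parallelism is a standard Terraform command-line flag, and 10 is its default value:

terraform apply -parallelism=10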

DAOS storage performance preview

To give you a preview of the performance of DAOS on IBM Cloud, we tested an internal development release of DAOS that contains performance features that will be available in the upcoming 2.4 version, which is expected to be released in late spring 2023. It should be considered a preview of things to come and a demonstration of what is possible.

Testing was done on a cluster of four storage nodes, each with 48 cores and 8 NVMe devices (bx2d-metal-96x384 profile), and 16 compute nodes using the cx2-16x32 profile (16 vCPUs, 32 GB memory).

For testing, we used the well-known IO500 benchmark with this DAOS-specific methodology and obtained the following results:

Individual IO500 test results

  • ior-easy-write: 38.687956 GiB/s, time: 346.296 seconds
  • mdtest-easy-write: 381.627554 kIOPS, time: 435.799 seconds
  • ior-hard-write: 7.424557 GiB/s, time: 391.666 seconds
  • mdtest-hard-write: 157.793821 kIOPS, time: 428.967 seconds
  • find: 276.135728 kIOPS, time: 845.487 seconds
  • ior-easy-read: 32.216002 GiB/s, time: 415.812 seconds
  • mdtest-easy-stat: 236.416785 kIOPS, time: 702.827 seconds
  • ior-hard-read: 6.685864 GiB/s, time: 434.946 seconds
  • mdtest-hard-stat: 227.684156 kIOPS, time: 297.614 seconds
  • mdtest-easy-delete: 151.984559 kIOPS, time: 1094.312 seconds
  • mdtest-hard-read: 211.442229 kIOPS, time: 320.404 seconds
  • mdtest-hard-delete: 133.868635 kIOPS, time: 526.874 seconds

IO500 score

  • Bandwidth: 15.771351 GiB/s
  • IOPS: 210.470718 kIOPS
  • TOTAL: 57.614299

From our point of view, these results are quite competitive. You can judge for yourself by viewing the IO500 results featured on the main page of the IO500 website.

Conclusion

Not long ago, distributed storage and large compute clusters were the province of the data center and were known for being slow to deploy, expensive and inflexible, presenting large challenges to even well-staffed and well-funded data centers. As we have shown in this blog, the advent of technologies like DAOS, IBM Cloud and Terraform is rapidly putting that behind us. They point to a future of quickly deployable, flexible, economical and highly performant HPC. We hope you will consider making this journey with DAOS and IBM Cloud. Please visit the DAOS on IBM Cloud automation repository and see how easy it is to get started building your own DAOS cluster on IBM Cloud.

Greg Mewhinney

Senior Engineer, IBM Cloud Performance Engineering

Paul Mazzurana

Senior Engineer, IBM Cloud Performance Engineering

=======================

IBM Security Randori: Prevent App Exploitation and Ransomware by Minimizing Your Attack Surface Security

3 min read

By:

Sanara Marsh, Director, Product Marketing

How attack surface management can establish a strong first line of defense against exploitation of public-facing applications.

There have always been and always will be unknown risks with organizations' external assets, but with today's sizeable remote workforce and their cloud, distributed and SaaS-based environments, it is essential to have a firm understanding of how many unknown and unmanaged assets an organization has. The IBM Security X-Force Threat Intelligence Index 2023 revealed that 26% of initial attack vectors involved the exploitation of public-facing applications (second only to phishing). Additionally, the report found that of all incidents remediated, the second-highest action on objective for attackers was ransomware at 17%.

Shadow IT—hardware or software deployed on the network without official administrative approval and/or oversight—poses a significant risk because these unmanaged, unknown assets are far more likely to contain vulnerabilities or be misconfigured, increasing the likelihood they will be targeted by an attacker. With shadow IT and web-based exploitation accounting for a growing share of ransomware attacks and one-third of all breaches, hardening and reducing an organization’s attack surface has become an essential tactic. One of the biggest challenges can be knowing where to start.

Get started with an attack surface management solution

As a critical first step, it is important to understand the size of your visibility gap. To do this, organizations need to conduct a gap analysis, comparing their list of known assets to those found by an attack surface management (ASM) solution and assessing the severity of the risk posed by shadow IT.

The focus here is not on the percentage of total assets found; no outside party will find all of your assets. Instead, organizations should focus more on the relative number of unknown assets discovered and the severity of the issues they contain. When done on an ongoing basis, this gap analysis can become a critical KPI that vulnerability management teams track and work to reduce over time. Identifying these assets will help uncover and minimize blind spots, misconfigurations and process failures with attack surface monitoring, vulnerability intelligence and risk management capabilities.

While conducting a gap analysis in the past was a time-consuming and expensive effort, a leading ASM solution like IBM Security Randori has made identifying gaps much faster and easier. Randori’s capabilities take more of an attacker’s perspective by using automated black-box discovery along with out-of-the-box integrations with leading asset management solutions, such as Axonius and Panaseer.

Conduct black-box reconnaissance

Some key steps used in black-box reconnaissance to conduct a gap analysis include the following (a short command-line sketch follows the list):

  • Adversaries most often start with no internal knowledge of target systems and are usually limited to publicly available information. All assessment of vulnerabilities, configurations and setup is done from outside the network. This approach is usually seeded with an email address or domain from the organization and tasked with fleshing out the rest.
  • There are numerous resources on open-source intelligence (OSINT) collection that prescribe step-by-step instructions for conducting hostname enumeration, kicking off network scans or leveraging certificate transparency logs.
  • Critical sources must include network registration, WHOIS lookups, hostname enumeration, certificate log investigation, direct scanning and interrogation of public threat-intelligence sources.
  • Artifacts gathered should include network and domain registration information, HTTP headers and banners, screenshots, SSL and TLS certificates, DNS records and enumerated software version and configuration (where possible).
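
As a minimal illustration of this outside-in collection, the following commands use common, publicly available tools against a placeholder domain (example.com); the crt.sh query format shown is an assumption based on that service's public interface:

whois example.com                                        # network and domain registration data
dig +short NS example.com                                # name servers and DNS records
curl -sI https://example.com                             # HTTP headers and banners
curl -s "https://crt.sh/?q=%25.example.com&output=json"  # certificate transparency log entries (assumed query format)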

Remember, the goal of any technical discovery is the identification of software, so any additional artifacts that will help identify, enumerate or access additional services are useful. In a future blog post, we’ll cover additional steps that are critical to prioritize and reduce attack surface exposures using an attacker’s perspective.

Learn more

To see how the IBM Security Randori platform can help your organization identify shadow IT, sign up for a free Attack Surface Review or visit our page.

Read the full IBM Security X-Force Threat Intelligence Index 2023 and check out the Security Intelligence article, "Backdoor Deployment and Ransomware: Top Threats Identified in X-Force Threat Intelligence Index 2023."

Sanara Marsh

Director, Product Marketing

=======================

Adding Instance Storage to an Existing VPC VSI Cloud

3 min read

By:

Dimitri Prosper, Product Manager

Step-by-step instructions on how to add instance storage to an existing VPC VSI.

The product and development team that delivers new features on IBM Cloud Virtual Private Cloud recently introduced a feature to create a virtual server instance (VSI) using an existing volume (i.e., a boot volume that is detached from a previously deployed VSI). This feature allows you to delete a VSI, but keep the boot volume with all its installed software and configuration for future reuse.  I became interested in this feature as a possible solution to a problem I encountered in one of my deployed solutions.

The problem

In a development environment, I deployed three (3) VSIs and installed and configured an application. The initial configuration used a VSI profile without instance storage. As I was testing the solution, it became apparent that I should have used a profile with instance storage to benefit from the low latency and higher disk I/O for some I/O-intensive processing in the application. Although the option does exist today to resize a VSI to a different profile, it is not currently possible to resize to a profile that includes instance storage.

An instance with a profile that does not include instance storage cannot be resized to a profile that does include instance storage (IBM Cloud docs).

Maybe, now, there is a way.

The solution

The recently introduced feature to create a VSI from an existing volume may be just what I need to move these VSIs without reinstalling my application. I need a small test scenario to confirm that I can execute the move without any data loss and that the installed software will continue to work as expected. I started by making some small changes to a Terraform template I previously used in the “Automate the Configuration of Instance Storage on a Linux-Based VSI” blog post. The changes were to create the VSI using a profile that does not contain instance storage. The modifications to the repository can be found here. I followed these steps to check against my goal:

  1. Modified the configuration file env-config-boot-volume.sh to override some of the default variables used by the Terraform template (a sketch of these overrides appears after this list):
    • TF_VAR_vpc_app_image_profile: I set the value to a profile that does not support instance storage.
    • TF_VAR_boot_volume_auto_delete: I set the value to false to prevent the boot volume from getting deleted when the VSI is deleted.
  2. Ran terraform init and then terraform apply to deploy the environment. As was done previously, a small application is deployed to the VSI that periodically attempts to write to the instance storage if it is available. However, nothing will get written during the initial deployment.
  3. With everything created, I switched to using the VSIs for VPC console UI. From there, I deleted the newly created VSI. As expected, the boot volume was not deleted.
  4. I selected to create a new VSI. This time, using the UI, I chose to use an existing volume that contained the app and all its configuration.
  5. I selected a profile that contains the instance storage I needed for the application.
  6. I configured the remainder of the instance create page with critical items, such as the Name, Reserved IP, VPC, Subnet, Security groups, SSH keys, etc., to match the previous configuration of the original VSI.
  7. Once the VSI was created, the service I had previously deployed as part of the Terraform template in Step 1 above to configure the instance storage took over (i.e., /usr/bin/instance-storage.sh). My app is now making use of the instance storage for its processing. Following this process enabled me to bypass one of the limitations I had previously encountered.
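
As a hedged sketch of the Step 1 overrides: the two TF_VAR names come from the configuration file described above, while the specific profile value is only an illustrative placeholder—pick any VPC profile without instance storage.

# env-config-boot-volume.sh (excerpt; the profile value is a placeholder)
export TF_VAR_vpc_app_image_profile="cx2-2x4"
export TF_VAR_boot_volume_auto_delete=false

Sourcing this file (or otherwise exporting the variables) before running terraform apply creates the VSI without instance storage and keeps its boot volume when the VSI is later deleted.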

Things to note

  • Using this process, the virtual server instance ID and CRN will change. Keep that in mind if you rely on that information in any of your automation or for logging and monitoring.
  • This use case of using the boot from existing volume feature to change the profile is fully supported by the team that delivered this feature.
  • There are other interesting use cases for this feature (e.g., moving a VSI from one VPC to another or regaining SSH access to an instance after losing your private key). Consult the product documentation.
  • Instance storage is ephemeral storage—only use it for data that is transient.

Wrapping up

If you have feedback, suggestions, or questions on this post, please reach out to me on LinkedIn. You can also open GitHub issues on the related code samples for clarifications.

Dimitri Prosper

Product Manager

=======================

Winning the IT Availability Battle: IBM Proactive Support Explained Analytics Cloud

5 min read

By:

Bina Hallman, VP, IBM Systems TLS Support Services

Maintaining high availability is critical for certain systems, and contracting proactive support for those mission-critical systems is how organizations can mitigate the potential effects of downtime.

More than three-quarters of CIOs indicated in the 2022 Forrester Priorities Survey that improving IT reliability and resilience is a priority for them. [1] This is hardly a surprise, given the significant impact that downtime can have on organizations. In the IDC report “The Cost of Downtime in Datacenter Environments: Key Drivers and How Support Providers Can Help,” clients advised that the most significant results of downtime were the impact on end-user satisfaction (with the loss of customers due to systems or sites being down) coupled with the financial impacts on time, regulatory compliance and potential penalties. [2] 

In that same report, IDC outlines the key factors impacting downtime, including the type of workload, industry and system, and the degree of automation/autonomous IT operations. Another factor to consider is the proliferation of vendors in the data center. Sixty-two percent of the companies surveyed reported that multivendor environments caused more downtime issues than single-source environments. [3]

So, what is the answer to maintaining high availability and reducing the impact and cost of downtime?

Support services directly impact downtime

According to IDC, “Enterprises should prioritize IT support services by workload criticality, viewing them as an investment in preserving the business value of these systems by relying on vendors for optimized performance.” [4] The report also notes that enterprises surveyed are currently saving 290 hours of downtime with server, storage and networking support contracts; more explicitly, they are preventing 79 hours of unplanned downtime thanks to predictive and proactive support tools. [5] The more critical the workload, the more proactive support should be considered.

In addition, the IDC MarketScape: Worldwide Support Services 2022 Vendor Assessment provides a view of the top benefits of support services [6]:

  • Improved hardware performance and overall hardware satisfaction
  • Faster incident resolution time
  • Easier incident resolution (less effort for IT staff)
  • Reduction in incidents due to proactive support services
  • Lower cost of operations across the hardware environment
  • Decrease in system downtime and crashes

Choosing a holistic support strategy

IBM believes these outcomes are achievable with a holistic support and services strategy that includes the following:

  • Predictive support analytics that provide ongoing insights to clients about preventive maintenance. For example, security and maintenance coverage alerts that will help identify product lifecycle exposures specific to IT systems, prevent outages across hybrid IT environments and avoid denials of support on expired contracts.
  • Proactive support that enables clients’ IT staff to focus less on the day-to-day maintenance of systems for things like firmware and code updates or coordination of problem determination and resolution. This type of support ensures priority problem resolution for mission-critical systems.
  • Optimization services that help clients overcome skills gaps, speed the adoption of new technologies, and get the most out of the features that their infrastructure provides.
  • Multivendor capabilities that provide a single point of contact over the data center server, storage, software and networking, enabling clients to improve problem determination and resolution in a hybrid environment.

Predictive analytics

With IBM proactive support, you get predictive analytics that are provided with IBM® Support Insights, IBM® Storage Insights or IBM® Call Home Connect (or all three). They deliver preventive maintenance insights like maintenance coverage, risk assessments and security alerts to help identify product lifecycle exposures specific to IT systems. They also help prevent outages across hybrid IT environments and avoid denials of support on expired contracts. If you have IBM Systems products and aren’t already using these precious resources, find out more about how to register your assets to enable predictive analytics for IBM Systems.

AIOps and support

IBM Systems products and support processes are increasingly relying on AIOps to provide predictive analytics and reduce the mean time to resolve issues (or even signal them before they occur). For example, a cloud-based rules engine predicts system conditions and proactively alerts clients of required action plans directly through the product UI to prevent the problem from occurring.  Automated analysis and debugging of dump data helps support experts resolve incidents and suggest the next actions to fix more quickly.

These are just a couple of examples of the many ways IBM is leveraging AIOps to streamline the support process and improve predictive alerts. We are deeply integrating and incorporating the strategic capabilities of IBM Instana Observability (for monitoring and visibility), IBM Turbonomic (for optimization), IBM Cloud Pak for Watson AIOps (for AI-driven IT operational insights) and other third-party ITOps tools to ensure that our platforms continue to be at the forefront of reliability and maintainability. By increasingly integrating analytics and AI capabilities, IBM expects to deliver even more powerful operations support and continued industry leadership.

Proactive support options from IBM

There are a number of ways clients can access proactive support from IBM. For the newest IBM infrastructure products, IBM Expert Care (offered at the time of purchase) provides a tiered approach to support that enables clients to choose the level of support based on their needs and the mission-criticality of the system. The Premium tier of IBM Expert Care includes enhanced response times, predictive alerts and a highly skilled technical account manager (TAM).

The TAM reviews the entire IT environment and is the client focal point for any issue, focusing on proactive actions to prevent issues from happening and assisting with problem resolution. With recommended proactive measures, IBM can help clients avoid unplanned downtime and maintain high reliability and availability of their systems. TAMs are different from traditional technical support specialists in that they develop long-term relationships with clients and act as their organization's advocates. Moreover, they collaborate directly with IBM product development and engineering labs and can deliver enhanced services to meet business objectives.

For older versions of IBM infrastructure and products that aren’t rolling out with IBM Expert Care, you can still opt for a more proactive approach with Premium Support Services for IBM Power Systems and IBM Storage and Proactive Support for IBM Z. These offerings are quite similar to the Premium tier of Expert Care and generally provide the same benefits.

The impact of multivendor support

It’s likely you also have infrastructure from other vendors in your data center. We saw above that this multivendor approach brings you more flexibility but increases your likelihood of downtime. We can also help clients more proactively manage their infrastructure holistically with IBM Support Services for Multivendor Server, Storage, Network and Security and support for Microsoft, Oracle, Red Hat and SUSE. Not only can leveraging multivendor support help to reduce your downtime, but it can also provide significant cost savings for many organizations.

Find out more today about the support options IBM Technology Lifecycle Services can provide to your organization to help you reduce downtime and reap additional benefits with proactive and multivendor support for your data center.

 

[1] Forrester Priorities Survey, 2022 Base: 144 Purchase influencers

[2] IDC Perspective: The Cost of Downtime in Datacenter Environments: Key Drivers and How Support Providers can Help, Doc # US50240823, March 2023

[3] IDC Perspective: The Cost of Downtime in Datacenter Environments: Key Drivers and How Support Providers can Help, Doc # US50240823, March 2023

[4] IDC Perspective: The Cost of Downtime in Datacenter Environments: Key Drivers and How Support Providers can Help, Doc # US50240823, March 2023

[5] IDC Perspective: The Cost of Downtime in Datacenter Environments: Key Drivers and How Support Providers can Help, Doc # US50240823, March 2023

[6] IDC MarketScape: Worldwide Support Services 2022 Vendor Assessment, Doc #US48896919e, March 2022

Bina Hallman

VP, IBM Systems TLS Support Services

=======================

Tenets of IT Modernization for Business Agility, Speed and Productivity Cloud

11 min read

By:

Balakrishnan Sreenivasan, Distinguished Engineer
Siddhartha Sood, Executive Architect

Exploring an objective-driven approach to modernizing IT and the various tenets of such an endeavor.

Many organizations today are joining the move-to-cloud bandwagon, primarily to reduce technical debt and cost and to meet CapEx-to-OpEx objectives. The work involved includes lift-and-shift, replatforming, remediations and more. As practices like DevOps, cloud-native, serverless and site reliability engineering (SRE) mature, the focus is shifting toward a significant level of automation, speed, agility and business alignment, helping IT organizations transform into engineering organizations.

CIOs/CTOs are realizing that the real power of all this lies in modernizing their applications and services to a product-centric model, which is core to achieving an engineering-centric IT organization. This blog post talks about an objective-driven approach to modernizing IT and the various tenets of such an endeavor.

Why modernize?

Enterprises today have many reasons for transforming their workloads to cloud, including time to market, cost reduction, increased agility, improved resiliency and more. The diagram below depicts these objectives (the purple chevrons) in a way that links them to specific actions enterprises should take (explained further later in this blog):

As these cloud transformation initiatives evolve, many enterprises are realizing that the move to cloud is not giving them the desired value or agility/speed beyond basic platform-level automation. The real problem lies in how IT is organized, which is reflected in how their current applications and services are built and managed (see Conway's law). This, in turn, leads to the following challenges:

  • Duplicative or overlapping capabilities offered by multiple IT systems/components create sticky dependencies and proliferations that impact productivity and speed to market.
  • Duplicative capabilities across applications and channels give rise to duplicative IT resources (e.g., skills and infrastructure).
  • Duplicative capabilities (including data) give rise to an inconsistent customer experience.
  • A lack of product thinking impacts the evolution of business capabilities in alignment with business and market needs. In addition, enterprises end up building several band-aids and architectural layers to support new business initiatives and innovations.

Modernizing applications and services (including data) to a set of domain-aligned products and capabilities while restructuring application teams into product teams helps address many of the above challenges. The best way to realize this model is to leverage cloud-native technologies (microservices, serverless and event-driven architectural styles) that help transform applications from a traditional monolith model to domain-aligned capability components (as-a-service based) and an end-state, zero- to low-touch operational model.

In this target state model, full stack squads build each of the application capabilities and services in a DevOps-driven way, with end-to-end automation that includes observability and operational automation. This eventually helps build an engineering-centric IT organization wherein the teams have complete ownership and accountability, and they have the highest degree of freedom to evolve their components with business.

Key benefits

Programs and initiatives with the objectives laid out above will help achieve certain benefits:

  • Improve productivity and speed to market with independent product teams that have minimal dependencies (outside service contracts).
  • Optimize IT resources (people and infra) by removing duplicative capabilities across applications and channels.
  • Remove dependencies on several shared services, as product teams have full-stack squads (and therefore fewer dependencies).
  • Avoid customer satisfaction issues through functional data consistency.
  • Evolve business capabilities in alignment with business and market needs through product thinking. This also helps establish a foundational IT platform for driving business innovation.

Tenets of modernization: The big picture

Throughout this article, we examine each of these objectives with an IT modernization lens that focuses on agility, speed and productivity. Before we proceed, we are going to look at a holistic view of various actions that need to be performed to realize the objectives mentioned above. Then we will dissect various pieces of the puzzle. Here are a few pointers on how to read this big picture.

Read the legend and the colors—they describe the category of actions that need to be performed on or around the application or system to achieve the end state. Then follow the color legend across the flow from left to right as per the following points:

  • Objectives (purple): Lays out the objectives we have discussed above into areas that can best describe how an enterprise IT is modernized and transformed to a product-centric model based on cloud-native principles.
  • Application decomposition and modernization workstream (pink): Decomposes the application into components.
  • Target operating model—org change/transformation (red): Focuses on target operating model activities (squad structure, process, etc.)
  • Talent and skilling (blue): Focuses on enabling the full-stack squads on ways of working, technology, tooling etc.
  • SRE/DevSecOps practices (green): Focuses on building necessary operational aspects into the IT capability components to ensure zero-touch operations.

The sections below describe each of the objectives and how they are achieved across the journey from a legacy application and team model to the end state.

Increasing agility and speed to market

Monolithic legacy applications supported by a layered, multi-team ecosystem impact agility and speed. Modernizing the legacy applications to domain-aligned capabilities and services supported by empowered full-stack squads helps drive the necessary agility and speed. It also helps remove redundant shared services via an automation-first approach.

Product-centric model

The product-centric model is about decomposing and rewriting existing monolith/legacy applications and services (including data) into a set of domain-aligned products that are supported by product teams (consisting of full-stack squads). Typically, industry-standard domain models (e.g., BIAN, IATA, TMForum) are customized to suit the enterprise. Domain models form the reference point for identifying products and decomposing application capabilities, and they also help restructure the IT organization into a product-oriented organization. Product-engineering practices are incorporated in various aspects of the lifecycle, from concept to the deployment and operation of IT capabilities.

Domain-driven-design-led modernization

Enterprises need a domain model that's closer to how the business operates. Typically, it is established through a core set of domain SMEs (a Domain Center of Excellence/Competence) aligned to the future-state vision (also referred to as the Target Operating Model). This core team builds the enterprise domain model (based on industry domain models) and helps with structuring the IT organization in a domain-centric approach where product teams own business capabilities within a domain.

The core team subsequently scales up domain-driven design (DDD) experts to help various project teams align to the domain models and DDD principles and practices. Architects and full-stack squads in each of the domain-led organizations should have the skills to decompose applications into capabilities and subsequently map them to the respective domain teams so that they can own and manage the capabilities in a cloud-native/NoOps-based model.

When moving towards product-centric IT, many clients under- or overestimate the organizational change needed to achieve these objectives. Such initiatives may fail if you underestimate the complexity involved in decomposing and reconstructing application capabilities along the lines of the identified domain-aligned products and services in the context of their associated domains.

We have defined a framework that outlines a systematic, disciplined way of applying DDD principles across the enterprise to simplify the complexity. An executable, step-by-step plan needs to be established to address these challenges, and this three-part blog series dives deeper into this topic.

Rapidly move applications to cloud

Enterprises will have a spectrum of applications with different criticality levels and business value. Typically, modernization journeys are applied to high-value/high-criticality applications. An automation-first approach (inclusive of tooling) is a must to gain the necessary acceleration to modernize these applications to cloud.

Such tooling typically supports discovery, analysis, design and execution through a patterns-based approach that helps rapidly move (e.g., migrate or containerize) these applications to cloud. IBM's cloud engineering—in partnership with many of our ecosystem partners—provides multiple proven tools and assets that help during each lifecycle phase. Once in the cloud, these apps are expected to continually align and evolve with the overall product-centric model and be supported end-to-end by product teams who continue to modernize them on the cloud.

Continuous business alignment

A domain-centric approach to decomposing and rebuilding applications ensures that the capabilities being built align well with the business. Identifying products based on domain-centric principles gives the necessary independence and ownership to the product teams. This, in turn, goes a long way in ensuring the continual evolution of product capabilities and services in alignment with business needs.

Product teams own IT capabilities and services as part of the product scope (which also includes infrastructure services, applications and data). This is key to driving agility and speed while different product teams collaborate through service contracts that include SLAs and more.

Value-stream mapping exercises help establish the various value streams and the business capabilities they offer; this becomes a key input for performing an efficient DDD. A structured DDD approach ensures that each capability is built or customized (in the case of third-party products) in a loosely coupled manner with clear data ownership.

Transforming current-state applications into a business value-stream-based products and services model needs a systematic approach:

  • A top-down approach of establishing value streams and business capabilities mapped to each of the value streams.
  • A bottom-up approach of decomposing and mapping current applications to business capabilities and subsequently a set of domain-driven products and services.

Our initial experiences indicate that a mix of top-down and bottom-up approaches works best, as there is a significant amount of domain knowledge to be harvested from the current-state applications and the functionalities they offer.

Improve resiliency with low-touch/no-touch hybrid cloud platform foundations

Product teams with full-stack squads require infrastructure and platform autonomy to succeed. This necessitates certain foundational platform capabilities (e.g., on-prem, cloud, etc.). These platform capabilities could be classified into core foundational services—landing zone, control tower, security capabilities, DevOps capabilities, multicloud capabilities, etc.—all aligned with a cloud target operating model (CTOM).

The entire foundational platform needs to account for the fact that, during transformation, applications and services need a hybrid platform (where some components remain on-prem while the rest are incrementally modernized to cloud). The platform should also ensure that application or product teams can consume all the services in an as-a-service model with an extreme degree of automation via Infrastructure as Code, DevOps pipelines and so on.

Enterprise DevOps and SRE automation

Continuing with the autonomy theme, along with platform capabilities, DevOps is also essential to this endeavor but can sometimes be plagued by legacy centralization models. Within a centralized model, a shared services team provides and manages all DevOps capabilities (e.g., tooling, pipelines, software supply chain KPIs, etc.). Such a centralized model has distinct advantages but often limits autonomy, flexibility and speed. Therefore, the foundation platform should offer the necessary tooling and services that full-stack squads can leverage to build, deploy, observe and manage their workloads.

It is critical to have Pipelines as Code and reference pipelines that not only provide just enough capabilities for teams to get started but also the flexibility to customize the pipelines based on specific workload needs. The key is to ensure that the right metrics and measures against enterprise KPIs are gathered and gated as part of pipeline execution to determine the overall product health and engineering rigor expected from product teams.

Transformation to Security as Code

Security organizations and processes need to embrace the cloud-native model and move towards automation. Security and compliance teams are often isolated and typically bring an outside-in perspective to the product lifecycle. They can be fully integrated into the product lifecycle by embedding proactive security and compliance validations into it.

This can be achieved by moving to a patterns-based design/develop/build/deploy/manage approach and by empowering teams to continuously validate their overall security posture by leveraging integrated security best practices. Various security and compliance policies are integrated into the platform and applications (via DevOps pipelines) as code, which helps establish the necessary guardrails. Application or product teams integrate a suite of security tooling and services into their DevOps pipelines to bring the necessary transparency to security and compliance adherence, which helps shift the security model left to the application or product teams.

Product lifecycle acceleration

Enterprises often retrofit cloud transformation elements within existing application supply-chain processes rather than considering new application lifecycle and delivery models that are better suited to delivering applications to cloud. Enterprises that reimagine the application lifecycle through an automation-first approach gain the product lifecycle acceleration that cloud transformation promises.

This typically requires that security, compliance, change management, operations, business continuity and business come together. It's important to have a single view of the end-to-end application lifecycle from concept to deploy/manage in cloud, where automation-driven transformation points can be identified.

Examples of such transformation points could include the following:

  • Codify and automate security and compliance requirements.
  • Implement a pattern-based solution definition approach to accelerate security, compliance and change-management processes  (patterns with embedded security implementation).
  • Re-use “patterns” as code.
  • Utilize DevOps pipeline-driven activities across the lifecycle.
  • Build traceability from security requirements to implementation.
  • Generate a high degree of data needed for governance, risk and compliance. Perform security and operational-readiness reviews with limited or no manual intervention.

Zero-touch operations model

Automated monitoring, insights, alerting and a suite of auto-healing/remediation capabilities are key to reducing manual touchpoints and achieving a zero-touch operations model. Full-stack squads and SREs build observability and day 2 operational management aspects into the capabilities and services with ‘no-manual-intervention’ as an objective. Enterprises need to focus on stressing the importance of an operations model that is process- and automation-dependent (rather than people dependent). This requires rigor, well-defined SRE, operational-readiness practices and collaboration:

Build engineering-focused teams through skill transformation

Cloud-native development expects full-stack squads to work in the following model: “You build, you own, you manage”

These full-stack squads are autonomous and own end-to-end progress, and this needs a foundational platform that truly implements various cloud services in a Platform-as-a-Service (PaaS) way for teams to build applications and services. There needs to be a systematic learning plan and framework based on personas to drive various aspects of learning (e.g., core programming skills, cloud services certifications, DevOps/SRE practices, DDD skills, etc.):

How does it all come together?

Let’s bring back the big picture, which groups the objectives we have discussed above into areas that can best describe how an enterprise IT is modernized and transformed to a product-centric model based on cloud-native principles. Don’t forget to look at the legend as described at the beginning:

The journey from objectives to the target operating model is a continuous endeavor. The time it takes to reach the target state depends on enterprise maturity across the people, process and technology dimensions. With multiple workstreams catering to different enterprise strategic objectives, each added objective creates significant complexity within the enterprise and requires a greater degree of effort across multiple teams and portfolios to implement and mature.

A Cloud Center of Competency (CCC) is a great acceleration mechanism to realize these strategic objectives within the enterprises. A center of competency is about the successful deployment of a new technology at the enterprise level. As an example, a CCC could help build codified reference architectures and patterns that integrate multiple cross-cutting concerns (e.g. non-functional requirements, operations, security, reliability, FinOps, compliance) into Infrastructure as Code (IaC). The IaC can then be leveraged by full-stack squads to deploy and manage end-to-end automated product capabilities to the cloud.

Conclusion

The evolution of cloud has opened a plethora of possibilities for various enterprises to exploit, and this makes the composable IT ecosystem a reality. The emergence of various proven practices like domain-driven design, DevOps, Infrastructure as Code (IaC) and Site Reliability Engineering (SRE) have made full-stack squads a reality. This enables the realization of independent product teams that can build end-to-end capabilities and services without layers of IT getting involved (as we have seen in traditional IT ecosystems).

Enterprises embarking on modernization initiatives to transform their IT ecosystem into a composable model need to recognize the quantum of change and operating-model transformation across the enterprise and think through this more pragmatically. They also need to recognize the fact that clarity on domains and processes will evolve with time, and there needs to be room for changes.

Enterprises need to adopt a multi-step approach to get to this model, considering the above challenges. Initial steps should focus on identifying a smaller subset of products (or domains) to pilot, demonstrate successes or fail fast, and identify learnings that can be fed back to refine the journey, plans and operating model. Moving to a composable IT ecosystem is a long journey, and measuring success at every intermediate change is key to long-term sustainable success.

The role of the Cloud Center of Competency (CCC) is crucial. It brings the right level of intervention to accelerate the journey by building reference architectures, patterns-as-code and guidance, and it helps work across various organization teams to establish an integrated develop-deploy-operate lifecycle.

Balakrishnan Sreenivasan

Distinguished Engineer

Siddhartha Sood

Executive Architect

=======================

Retrieve and Analyze Your Cloud Access Management Data Cloud Security

4 min read

By:

Henrik Loeser, Technical Offering Manager / Developer Advocate

IBM Cloud offers APIs to retrieve identity and access management data. In this post, we show how to analyze this data to improve your cloud security setup.

With an IBM Cloud account comes responsibility, including setting up and monitoring access management for your cloud resources. In the past, I have shared with you how to get insights into IBM Cloud account privileges or how to improve security by identifying inactive identities. In this blog post, we will give you an overview of existing APIs you can use to obtain identity and access management (IAM) and resource data. Then, we will show you how to analyze that security data. With these insights, you can improve security for your IBM Cloud account and its resources.

There are many ways to analyze access management data. We chose to retrieve the data and store it in a relational database. That way, we can easily combine data from multiple sources and run SQL queries, thereby generating security reports.

The source code for the discussed project is available on GitHub.

Overview of IBM Cloud APIs for platform services.

Overview: Access management data

If you have worked with IBM Cloud and looked into security and compliance before, you may have already used a number of platform-provided tools to improve account security.

In addition to those tools, there is data about the account, its resources, users and service IDs, and their privileges. In this post, we refer to that data as “access management data.” It can be viewed and retrieved in many ways, including through the IBM Cloud console (UI), the command line interface (CLI) and other interfaces. In the following, we focus on the Application Programming Interfaces (APIs) for the IBM Cloud platform services (see screenshot above). You can access their documentation by going to the API and SDK reference library and then selecting the Platform category.

For access management data, the important IBM Cloud APIs include those covering IAM policies, access groups, identities (users, service IDs and trusted profiles) and account resources.

There are more APIs available, but the above form the core. Data from these APIs helps establish a (mostly static) snapshot view of the security setup. It is similar to what (on a high level and ignoring details) the IBM Cloud Security and Compliance Center evaluates.

Each of the API functions requires an IAM access token and returns a set of JSON data. The real value comes from combining the data for the full picture—composing the puzzle from many pieces. It is the first step toward the security analysis. The data from all the APIs can be held briefly in memory (just for running some reports), or it can be persisted for deeper analysis. We opted for the latter and decomposed the JSON objects into relational tables. This means we can run SQL queries and benefit from their expressive power for the analysis.

It is important to note that the analysis does not cover any dynamic membership rules or context- or time-based access decisions. Deciding on access as part of IAM processing requires more dynamic data. We do not want to, and cannot, mimic IAM decisions. The analysis only helps to find interesting spots in the security setup to investigate and possibly improve.

Retrieve and store

To build our base of access management data, we started by mapping the different JSON objects to relational tables. Some of the JSON objects have nested data. For example, when listing policies, the results include the policy metadata and the policy subjects, roles and resource information. This leads to four policy-related tables in our data store. Similar mappings need to be performed for the other API results. The result is a database schema as depicted below:

Entity Relationship diagram for the database schema.

We decided to use Python to retrieve and store the data based on existing code from previous work. Depending on the API function, data retrieval could require paging through result sets. Often, the number of objects in a single result is limited to 100. Some API functions require additional parameters to be passed for enriched results. The latter contain additional details that are useful for security analysis.

We use SQLAlchemy—a database toolkit for Python—for access to the data store. It allows you to easily switch from, for example, SQLite to PostgreSQL or Db2 on Cloud as a backend database.
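
The following is a minimal sketch of that retrieve-and-store pattern. The IAM token endpoint and the v2 access groups endpoint are standard IBM Cloud IAM APIs, but the paging logic and the single access_groups table are simplified illustrations rather than the project's actual code, and the environment variable names are placeholders:

import os

import requests
from sqlalchemy import create_engine, text

IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"
ACCESS_GROUPS_URL = "https://iam.cloud.ibm.com/v2/groups"


def get_iam_token(api_key):
    # Exchange an IBM Cloud API key for an IAM access token
    resp = requests.post(
        IAM_TOKEN_URL,
        data={"grant_type": "urn:ibm:params:oauth:grant-type:apikey", "apikey": api_key},
        headers={"Accept": "application/json"},
    )
    resp.raise_for_status()
    return resp.json()["access_token"]


def list_access_groups(token, account_id):
    # Page through the account's access groups, 100 at a time (pagination simplified)
    headers = {"Authorization": "Bearer " + token}
    offset, limit, groups = 0, 100, []
    while True:
        resp = requests.get(
            ACCESS_GROUPS_URL,
            headers=headers,
            params={"account_id": account_id, "limit": limit, "offset": offset},
        )
        resp.raise_for_status()
        page = resp.json()
        groups.extend(page.get("groups", []))
        if offset + limit >= page.get("total_count", 0):
            return groups
        offset += limit


def store_groups(groups, db_url="sqlite:///iam_data.db"):
    # Decompose the JSON objects into a simple relational table;
    # changing db_url switches the data store to PostgreSQL or Db2 on Cloud
    engine = create_engine(db_url)
    with engine.begin() as conn:
        conn.execute(text(
            "CREATE TABLE IF NOT EXISTS access_groups "
            "(id TEXT PRIMARY KEY, name TEXT, description TEXT)"))
        for g in groups:
            conn.execute(
                # SQLite-style upsert, kept simple for the sketch
                text("INSERT OR REPLACE INTO access_groups (id, name, description) "
                     "VALUES (:id, :name, :description)"),
                {"id": g["id"], "name": g["name"], "description": g.get("description", "")},
            )


if __name__ == "__main__":
    iam_token = get_iam_token(os.environ["IBMCLOUD_API_KEY"])
    store_groups(list_access_groups(iam_token, os.environ["IBMCLOUD_ACCOUNT_ID"]))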

Analyze cloud access management

With the data store in place, it is time to analyze the cloud access management data. By combining data that is usually only available in different console pages or from multiple API calls/CLI commands, we can easily answer security-related questions like these:

  • Which cloud service instances are referenced in access policies but do not exist?
  • Which cloud service instances exist but are not referenced in any access group or its policies?
  • Which users (or service IDs or trusted profiles) are not a member of any access group?
  • Which access groups do not have any policies with Reader or Viewer roles?
  • Which access groups do not reference any region or resource group in their policies?

The SQL queries for the above questions can be run from a Python script, in a Jupyter or Zeppelin notebook, or in other SQL clients. The screenshot below shows part of a text-based report produced by a simple Python script. The related SQL statement joins many of the tables in our data store:

Report generated on existing IBM Cloud IAM Access Groups.
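
As an illustration of the kind of query involved—using hypothetical table and column names, since the actual schema is defined in the GitHub repository—a check for users that are not a member of any access group could look like this:

-- Hypothetical table and column names for illustration
SELECT u.iam_id, u.email
FROM users u
LEFT JOIN access_group_members m ON m.iam_id = u.iam_id
WHERE m.iam_id IS NULL;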

Conclusions

The analysis of cloud access management data can enhance the security of your IBM Cloud account. It adds to the existing platform-provided tools that I mentioned in the overview. The generated reports serve as input to revisit existing access groups and may lead to the removal of some policies and access groups. As noted, it is a static analysis and does not consider context data and dynamic rules.

The tool and its code are available in a GitHub repository. If you have an interesting query to share or improvements to add, please feel free to open a pull request. Also, check out some of our other security blog posts.

If you have feedback, suggestions, or questions about this post, please reach out to me on Twitter (@data_henrik), Mastodon (@data_henrik@mastodon.social) or LinkedIn.

Henrik Loeser

Technical Offering Manager / Developer Advocate

=======================

Upgrading IBM Cloud Databases for PostgreSQL with Minimal Downtime Database

3 min read

By:

James Thorne, IBM Cloud Databases
Daniel Pittner, STSM, Cloud Databases Security Architect

A new process is available via PostgreSQL logical replication that allows continued writes to the database during and after an upgrade.

IBM Cloud Databases for PostgreSQL offers two direct ways to perform a major version upgrade:

  • Provision a read replica and choose to perform an upgrade when promoting it.
  • Back up the database and restore it into a new instance (optionally performing a point-in-time restore).

Unfortunately, both processes involve a period of time during which writes to the database must be suspended to prevent data from being lost following the upgrade.

When upgrading from IBM Cloud Databases for PostgreSQL versions 10+, a new process is available via PostgreSQL logical replication that allows continued writes to the database during and after the upgrade, requiring only a momentary interruption while application(s) are reconfigured to write to the upgraded database instance. This post walks through how to execute the process and discusses some of the caveats and limitations associated with it.

Note: We recommend testing the procedure described below in a non-production environment first to get familiar with it and identify any issues that may occur before attempting the upgrade against any production database instances.

The upgrade process

First, you’ll need to prepare the original database instance to be upgraded:

  • Enable logical replication as described in the IBM Cloud Databases for PostgreSQL wal2json documentation up to Step 2 (complete the wal_* configuration and set a password for the repl user):
    • Note that Step 3 isn’t supported on IBM Cloud Databases for PostgreSQL version 10, but it isn’t needed to complete the migration.
  • Grant the replication (repl) user permission to read all tables you want to migrate using GRANT SELECT {…} TO repl;:
    • You can grant access to all tables in a schema you wish to migrate with GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO repl;, filling in the {schema} name as appropriate.
    • The GRANT command is described in more detail in the PostgreSQL documentation.
  • Collect the hostname and port of the source instance from the Endpoints > PostgreSQL panel on the Overview tab of the database console or by using the ibmcloud cdb deployment-connections CLI.

Next, to perform the upgrade (a consolidated example with hypothetical names follows this list):

  • Create a new IBM Cloud Databases for PostgreSQL instance at the target version and load all tables via DDL:
    • You can use pg_dump --schema-only/pg_restore to migrate the DDL.
    • pg_dump is described in more detail in the PostgreSQL documentation.
  • Create publication(s) on the original database instance for the table(s) you wish to migrate using CREATE PUBLICATION {schema}_migration FOR TABLE {table}, {table}, {...};, filling in the {table} names and {schema} as needed.
  • Create a subscription on the target database instance using SELECT create_subscription('{schema}_subscription', '{hostname}', '{port}', '{password}', 'repl', 'ibmclouddb', '{schema}_migration');, filling in the fields as needed.
  • From the original database, watch the target database replicate data using SELECT slot_name, confirmed_flush_lsn, pg_current_wal_lsn(), (pg_current_wal_lsn() - confirmed_flush_lsn) AS lsn_distance FROM pg_replication_slots;:
    • If data isn’t replicating, check the logs of both the original and target databases via the IBM Cloud Log Analysis integration for possible issues.
    • If the replication slot no longer appears, it may have been interrupted by maintenance. See “Caveats and limitations” below.
  • Once the target has caught up (and the lsn_distance has dropped to zero), reconfigure your application(s) to begin writing to the target instance rather than the source.
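
As a consolidated sketch of the steps above—using a hypothetical schema (myapp), table (orders) and placeholder connection details—the commands look roughly like this:

-- On the original (source) instance
GRANT SELECT ON ALL TABLES IN SCHEMA myapp TO repl;
CREATE PUBLICATION myapp_migration FOR TABLE myapp.orders;

-- On the new (target) instance (hostname, port and password are placeholders)
SELECT create_subscription('myapp_subscription', 'source-host.example.com', '32333', 'repl-password', 'repl', 'ibmclouddb', 'myapp_migration');

-- On the original instance: watch the target catch up until lsn_distance reaches zero
SELECT slot_name, confirmed_flush_lsn, pg_current_wal_lsn(),
       (pg_current_wal_lsn() - confirmed_flush_lsn) AS lsn_distance
FROM pg_replication_slots;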

After verifying that the upgrade completed successfully, clean up as follows:

  1. Remove the subscription(s) created above using SELECT delete_subscription('{schema}_subscription', 'ibmclouddb');.
  2. Delete the source database instance.

Caveats and limitations

Regularly scheduled maintenance performed by IBM Cloud Databases may impact the migration process and require it to be restarted. In that case, there should be no impact to running application(s) as the original database instance will continue to operate normally until the very end of the procedure.

The migration process has the same limitations as PostgreSQL logical replication, including the following:

  • Schema and DDL commands are not replicated. Accordingly, the schema must be created manually on the target instance above, and schema changes performed to the original database instance after the target instance begins replication may cause the replication to fail.
  • Sequences are not replicated.
  • TRUNCATE actions involving tables that are not included in the subscription may fail.
  • Large objects are not replicated.
  • Only tables can be replicated; views, materialized views and foreign tables must be migrated separately.

Learn more

See the PostgreSQL logical replication documentation for more details.

Get started with IBM Cloud Databases for PostgreSQL.

James Thorne

IBM Cloud Databases

Daniel Pittner

STSM, Cloud Databases Security Architect

=======================

Restrict Access to Your IBM Cloud Resources Using Context-Based Restrictions Cloud

4 min read

By:

Josephine Justin, Architect
Norton Samuel Stanley, Lead Software Engineer - IBM Cloud Storage

Context-based restrictions (CBRs) in IBM Cloud can be used to enhance security and protect sensitive data.

By leveraging these features, organizations can ensure that their cloud resources are accessed only by authorized users and comply with various security and regulatory requirements.

What are context-based restrictions?

IBM Cloud context-based restrictions (CBRs) help to ensure that only authorized users can access sensitive resources. They grant access based on the user's role, location or other contextual factors. This helps to protect customer data and minimize the risk of unauthorized access or breaches. See the IBM Cloud documentation to learn more about how context-based restrictions work. 

When to use context-based restrictions

The following are some specific scenarios where IBM Cloud context-based restrictions can be used:

  • Role-based access control (RBAC): With RBAC in IBM Cloud, the organization can grant different levels of access and permissions to users based on their roles and responsibilities. For example, developers may have access to development resources, while operations personnel may have access to production resources. RBAC can help ensure that only authorized personnel have access to the resources they need to do their jobs.
  • IP address allow-listing: An organization wants to restrict access to its cloud resources to specific IP addresses. With IP address allow-listing in IBM Cloud, the organization can specify which IP addresses are allowed to access its resources. For example, the organization may allow-list the IP addresses of its headquarters and branch offices while blocking access from other locations.
  • Geolocation restrictions: An organization needs to comply with local data privacy regulations that require data to be stored within a specific region. With geolocation restrictions, the organization can restrict access to resources in a specific region to users located within that region. This can help ensure that sensitive data is protected and only accessible to authorized personnel within the specified region.
  • Resource-based access control (ReBAC): An organization wants to restrict access to its most sensitive data and resources. With ReBAC, access can be granted to specific resources based on the user's role and permissions. For example, the organization may restrict access to its financial data to only authorized personnel with specific roles and permissions. Additionally, IBM Cloud can audit access to these resources to ensure that only authorized users are accessing them.
Context-based restriction use cases

Here are few use cases for context-based restrictions:

  • Regulatory compliance: Many industries—including finance, healthcare and government—have strict regulatory requirements around data protection and access control. IBM Cloud's context-based restrictions can help organizations comply with these regulations by ensuring that only authorized personnel have access to sensitive data. For example, an organization in the healthcare industry may use geolocation restrictions to ensure that patient data is only accessible to healthcare professionals within a specific region.
  • Application development and testing: In application development and testing environments, context-based restrictions can help ensure that developers and testers have access to the resources they need without exposing sensitive data or resources to unauthorized users.
  • Disaster recovery and business continuity: In disaster recovery and business continuity scenarios, IBM Cloud's context-based restrictions can help ensure that critical resources and data are protected and available to authorized personnel. IBM Cloud can use geolocation restrictions to ensure that backup and recovery resources are only accessible from authorized locations.
Rule implementations

1. Enforce a rule

Context-based restriction rules can be enforced upon creation and updated at any time. Rule enforcement can be of three types:

  • Enabled: Enforces the rule and restricts the access to services based on the rule definition.
  • Disabled: No restrictions are applied to the resources. 
  • Report-only: Allows you to monitor how the rule affects you without enforcing it. All access attempts are logged in Activity Tracker. It is recommended to run a rule in report-only mode for 30 days before enforcing it. Some services do not support this mode (e.g., IBM Cloud Databases resources).

Rules created in report-only mode can be listed using the CLI with the following command:

ibmcloud cbr rules --enforcement-mode report
2. Scope a rule

You can narrow the scope of a rule to specific APIs to achieve fine-grained security in your system. Only some services support scoping a rule by API. To find the API types available for a specific service (for example, the IBM Cloud Kubernetes Service), first list the supported services and their API scopes by using the CLI. To create a rule with a restricted scope, use the API-types attribute, as in the hedged sketch below:
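
The following is a minimal sketch only: the --api-types flag name and the API type value are assumptions based on the description above, and <zone-id> is a placeholder for one of your network zone IDs.

# Flag name and API type value are assumptions; <zone-id> and <api-type-identifier> are placeholders
ibmcloud cbr rule-create --context-attributes "networkZoneId=<zone-id>" --service-name containers-kubernetes --enforcement-mode report --api-types "<api-type-identifier>"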

3. Restrict access using tags

You can create rules to restrict access to specific instances based on access tags. IBM Cloud resources can be created and accessed with IAM access tags, and these tags can be used to restrict access using context-based restrictions. To allow only specific VPCs to access the IBM Cloud Object Storage service instances that are assigned the tag “env:test”, create a rule with the rule-create command:

ibmcloud cbr rule-create --context-attributes  "networkZoneId=ca1c2bb48b40ed7c595a6ff3ed49f055" --service-name cloud-object-storage --enforcement-mode report --tags "env=test"

Note: You must create the zone before you can create the rules. To create the zone, refer to Creating network zone from the CLI.
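
To find the zone ID to use as the networkZoneId value, you can list your existing zones from the CLI; the command below is an assumption that parallels the rules command shown earlier:

# List network zones and their IDs (command name assumed by analogy with "ibmcloud cbr rules")
ibmcloud cbr zones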

4. Restrict using IP addresses

Restrict access to IBM Cloud resources to the IP addresses of authorized personnel by using context-based restrictions. Create a zone for those IP addresses and then create a rule that references the zone, as in the sketch below:
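
A minimal sketch, assuming the zone-create command accepts an --addresses flag; the IP ranges shown are placeholders from documentation address space:

# Create a zone for the allowed IP addresses (flag names and addresses are assumptions/placeholders)
ibmcloud cbr zone-create --name office-ips --addresses "203.0.113.0/24,198.51.100.10"
# Reference the new zone in a rule for the service you want to protect
ibmcloud cbr rule-create --context-attributes "networkZoneId=<zone-id-from-previous-command>" --service-name cloud-object-storage --enforcement-mode report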

Additionally, you can allow different IP addresses for public and private endpoints of a service. 

5. Create specific geolocation-based restrictions

Access to a service can be restricted to specific locations to help meet data residency requirements.

6. Monitor the rules

To monitor rule behavior in enabled or report-only mode, refer to Monitoring context-based restrictions.

Conclusion

IBM Cloud's context-based restrictions can help organizations ensure that their cloud resources are protected, compliant and accessible only to authorized personnel. By leveraging these features, organizations can mitigate security risks, enhance compliance and improve operational efficiency.

Josephine Justin

Architect

Norton Samuel Stanley

Lead Software Engineer - IBM Cloud Storage

=======================

Deploy a Globally Distributed Web App on Your Domain with Code Engine and Cloud Internet Services Cloud

5 min read

By:

Sascha Schwarze, IBM Cloud Developer Services
Enrico Regge, Senior Software Developer

How to use a global load balancer to efficiently route requests on a custom domain to the nearest IBM Cloud Code Engine web application.

IBM Cloud Code Engine is the easiest way to deploy and run your source code or container on the IBM Cloud. Our goal has always been to allow you to focus on the development of the application code itself, while Code Engine manages the underlying infrastructure.

Part of that infrastructure is automatically providing a secure and reliable HTTPS endpoint for your application, which includes DNS routing and TLS certificates. Once your application is deployed, it will be accessible via a generic URL like this: https://<myapp>.<randomcharacters>.<region-name>.codeengine.appdomain.cloud. By configuring domain mappings, you can also expose your Code Engine application on your own domain, meaning it would be accessible via https://<myapp>.<mydomain>.

This setup is great for serving an application that resides in a specific region of the world to your local users. However, if your audience is global, there is a slight issue: all requests will be routed to one specific IBM Cloud data center. Assuming you’ve chosen the Frankfurt MZR for your deployment, users in Australia or South America will have to deal with some latency. For a personal or hobby application that might not be a big deal, but it could become an issue for enterprise-grade applications.

To address these use cases, we will combine Code Engine with another IBM Cloud service—IBM Cloud Internet Services (CIS)—that provides capabilities for exposing internet applications in a secure, scalable and reliable manner.

In this blog post, we’ll demonstrate how one specific CIS capability called DNS-based geo-locational routing can be used to reduce latency for a globally distributed Code Engine application and achieve high availability using multiple regions.

Setting up your domain in Cloud Internet Services

Let’s start by creating a Cloud Internet Services instance in your IBM Cloud account. Start by navigating to the appropriate place in the Catalog. There is a free 30-day trial available that provides enough capabilities for our setup.

After your instance is created, you will need to add your domain to it. This process includes delegation of the domain management to Cloud Internet Services. The full set of steps is described in Add and configure your domain. If you haven’t done this already, then it is time for a cup of coffee, because the propagation of DNS records can take a few minutes up to a couple of hours.

Deciding on the application domain

Once your domain is ready for use in Cloud Internet Services, it’s time to decide which domain to use to serve your application. Let’s assume your domain is “example.com” and you allowed your CIS instance to control it in the step above. Obviously, you can use any other domain name, but example.com is what we’ll use for this blog post.

Note: From now on, this example will assume that “example.com” was handed over to CIS and uses “global-app.example.com” as the domain name for the application. Make sure that you adjust sample commands when you copy them from the following steps.

Generating a certificate for your application domain

The next step is to order a certificate for your domain name. We will use the Let’s Encrypt service:

  1. Install certbot. Certbot is a client for the Automatic Certificate Management Environment (ACME) protocol that Let’s Encrypt uses to verify domain ownership and to hand out certificates. From the instructions page, select Other as software and the operating system of your workstation to get the right instructions to install the certbot command line tool.
  2. Run the following command and adjust it for your domain:
    certbot certonly --manual --preferred-challenges dns --email webmaster@example.com --server https://acme-v02.api.letsencrypt.org/directory --agree-tos --domain global-app.example.com
  3. To verify that you own the domain, you are now required to set a TXT record for the domain that you requested with a value provided by the tool; in the above example, the record name is “_acme-challenge.global-app.example.com”. As the domain’s DNS is already delegated to Cloud Internet Services, we must perform this there:
    • Navigate to Reliability > DNS.
    • In the section DNS records, click Add.
    • Set the Type to TXT.
    • Set the Name to “_acme-challenge.global-app” (yes, without example.com because this suffix is what CIS is managing).
    • Set the Content to the value from the certbot command.
    • Click Add.
    • You can verify that the record was set by running the following command: dig _acme-challenge.global-app.example.com txt
  4. Press Enter in certbot to continue. Certbot now retrieves the certificate that is signed by Let’s Encrypt. It will provide the location where the certificate is stored. The two files that you will need from there are “fullchain.pem” and “privkey.pem”.
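
Optionally, you can confirm that the issued certificate covers your application domain and check its validity period with openssl:

# Print the certificate subject and validity dates
openssl x509 -in fullchain.pem -noout -subject -dates
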
Setting up your application in Code Engine

In the next step, we’ll set up the same Code Engine application in three projects that are distributed across the world in three different IBM Cloud regions. I suggest that you take one region that is near your current location and two locations that are far away. Since I’m located in Europe, I will use Frankfurt, Sao Paulo and Sydney.

Go to Code Engine’s project page and create three projects. You may want to use a common naming pattern and a shared tag, as shown in the screenshot below:

Next, we’ll go into all of three projects and create a simple application:

  1. On the Overview page, click Create application.
  2. Provide a Name (e.g., “global-app”).
  3. As image reference, you can use your own application. For this scenario, I will stay with the IBM-provided hello-world-image “icr.io/codeengine/helloworld” for a good reason—it shows the Code Engine environment variables that indicate the region in which the app runs. This will be interesting once we have a global endpoint for the application.
  4. Scroll down to the Runtime Settings section. In there, set the minimum number of instances to 1. This ensures the application responds instantaneously to the health checks that we will later configure in Cloud Internet Services.
  5. You can leave all other settings at their default. Finally, click Create.
  6. Wait for the application to become Ready, then click Test application followed by the Send request button. If you used the hello-world-image, then you’ll see a welcome message that includes some environment variables. Look for CE_DOMAIN, which indicates the region in which the app runs:
  7. Back in Code Engine, go to the Domain mappings tab of the application.
  8. Click Create to set up a custom domain mapping for your application:
    • Copy the content of the “fullchain.pem” into the Certificate chain field.
    • Copy the content of the “privkey.pem” into the Private key field.
    • Enter the Domain name (e.g., “global-app.example.com”).
    • Capture the CNAME target. We will not use it to set up a direct CNAME for your domain, but we will need it for the routing in Cloud Internet Services.
    • Click Create to create the domain mapping. Wait for its status to become Ready. Note: Because the domain setup is not done, you will not be able to access it through the domain yet.
  9. Optionally, you can change the visibility to No external system domain mapping. This will disable the public https://global-app.<randomcharacters>.<region-name>.codeengine.appdomain.cloud endpoint.

We now have the same Code Engine application running in three regions that are mapped to the same domain. Next, we need to make sure that Cloud Internet Services distributes the load to these three regions.
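
If you prefer the CLI, the console steps above map roughly to the following commands. This is a hedged sketch: the command and flag names are based on the Code Engine CLI plug-in and may differ between versions, and the project, secret and file names are illustrative.

# Create and select a project in one region (repeat for each region)
ibmcloud ce project create --name global-app-eu-de
ibmcloud ce project select --name global-app-eu-de
# Deploy the sample app with one always-on instance
ibmcloud ce application create --name global-app --image icr.io/codeengine/helloworld --min-scale 1
# Store the Let's Encrypt certificate as a TLS secret and map the custom domain to the app
ibmcloud ce secret create --format tls --name global-app-tls --cert-chain-file fullchain.pem --private-key-file privkey.pem
ibmcloud ce domainmapping create --domain-name global-app.example.com --target global-app --tls-secret global-app-tls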

Setting up a health check

Let’s move over to Cloud Internet Services and configure a health check that is suitable for the application:

  1. Navigate to Reliability > Global load balancers > Health checks.
  2. Click Create.
  3. As Name, I suggest using the application name (e.g., “global-app”).
  4. Set the Monitor Type to HTTPS and the Port to 443.
  5. The other settings can be kept at their default assuming you are using the hello-world-image. If you are using your own application, then you may need to adjust the Path and further settings within the Advanced options. Ideally, you implement a dedicated health-check endpoint in your application that you can call here.
  6. Click Create.

Please see Setting up health checks for more information.
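
Before relying on the CIS health check, you can verify from your workstation that each regional endpoint answers HTTPS requests (assuming you kept the public system domain mapping enabled; the URL below is a placeholder following the generic Code Engine pattern):

# Expect a 200 status code if the application is up and will pass the health check
curl -s -o /dev/null -w "%{http_code}\n" https://global-app.<randomcharacters>.au-syd.codeengine.appdomain.cloud/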

Configuring the origin pools

As a next step, let’s define one origin pool per region:

  1. Navigate to Reliability > Global load balancers > Origin pools.
  2. Click Create.
  3. As Name, I suggest using APP_NAME-REGION (e.g., “global-app-au-syd”).
  4. As Origin name, I suggest using the region (e.g., “au-syd”).
  5. Set the Origin address to the CNAME target that you have captured during the domain mapping setup in Code Engine. The value is “custom.<randomcharacters>.au-syd.codeengine.appdomain.cloud”.
  6. Set the Host header to your domain name (e.g., “global-app.example.com”).
  7. Under Health check, select Existing health check. Then select the “global-app” health check that you previously created.
  8. The Health check region drop-down is now unlocked. Select a region that is near the Code Engine region (e.g., Oceania for Sydney).
  9. Click Save.

After the origin pool is saved, it will initially be shown with critical health. After around two minutes (at most), it should change to healthy. A manual page refresh in your browser may be necessary to see the status update.

Before moving on to the next section, make sure to create an origin pool for each region that you want to address.

Configuring the load balancer

Finally, you can set up the load balancer:

  1. Navigate to Reliability > Global load balancers > Load balancers and click Create.
  2. The Name defines the domain of your application. As I am controlling “example.com” in Cloud Internet Services, I must enter “global-app” so that the domain will be “global-app.example.com”.
  3. Set Traffic steering to Geo.
  4. Add Geo routes:
    1. You can define a geo route for each Cloud Internet Services region. In all of them, add all of the origin pools you created and sort them so that the nearest Code Engine region has top priority. In my case, for Oceania, the origin pool of Sydney came first. In Eastern and Western Europe, I put Frankfurt at the top. In Northern and Southern South America, I chose Sao Paulo.
    2. If you choose to define a geo route for only some Cloud Internet Services regions, you must add a route for the Default region, where you can select all available origin pools of the application. This route will be the fallback.
  5. Click Create to create the load balancer.

And that’s it. We have set up an application that will be routed to the nearest Code Engine region.

Seeing it in action

Open your domain in your browser. Using the CE_DOMAIN environment variable, you can see which Code Engine region you are targeting. If the region does not match your expectation, it could have one of these reasons:

  • Are you inside an enterprise network? They sometimes use internet gateways far away from your physical location.
  • Are you running with a VPN connection? The VPN server may be in a different geo region.
  • Are you using an HTTP proxy? Again, that proxy server may be far away.

Public internet services such as https://www.iplocation.net/ can help you to determine where the IP address that you are using to connect to the internet is located.
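
You can also check from the command line; the following assumes the hello-world image, whose response body includes the CE_DOMAIN variable:

# Print the line of the response that names the serving region
curl -s https://global-app.example.com | grep CE_DOMAIN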

I ended up, as expected, in Frankfurt.

High availability

Now, let’s do an exercise. Let’s assume there is a regional problem, and for whatever reason, your application is down in one region. Will CIS still route the request?

Let’s delete the Code Engine application in the region that is currently being targeted. In my case, I am deleting the “global-app” from the Code Engine project in Frankfurt. Directly after the app is deleted, refresh the browser. Assuming you were quick enough, you will now see a failure. At this point in time, the target application in Frankfurt is down, but the health check that runs in Cloud Internet Services every 60 seconds has not yet determined this. Eventually, the endpoint will be functional again: you now reach the Code Engine region that is the second-priority origin pool in the geo route. In my case, this is Sydney.

Note: 60 seconds is the default and the minimum interval for the Cloud Internet Services Free and Standard plans. You can reduce this to five seconds in the advanced options of the health check if you are using an enterprise plan.

You can now recreate the application in Code Engine, including the domain mapping. While creating the domain mapping, you can select the existing TLS secret, assuming you did not delete it. After at most a minute, the app will be served again from Frankfurt.

Note: Browsers will, by default, reuse the existing connection. Given the other region is still alive, you may still get the response from there (in my case, from Sydney). A browser restart may be required to force using a new connection. Alternatively, use another browser or curl from the command line.
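
Because curl opens a new connection on every invocation, a simple polling loop makes the failover and the later recovery easy to observe (again assuming the hello-world image):

# Print the serving region every 10 seconds
while true; do curl -s https://global-app.example.com | grep CE_DOMAIN; sleep 10; done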

Conclusion

We have set up an IBM Cloud Code Engine application in several IBM Cloud regions and used its domain mapping feature together with IBM Cloud Internet Services to set up a global endpoint that is routed by geo location to provide optimized latency and high availability.

What’s next?

If you have feedback, suggestions, or questions about this post, please reach out to us on StackOverflow by using one of the following tags "ibm-cloud" or "ibm-cloud-code-engine".

Sascha Schwarze

IBM Cloud Developer Services

Enrico Regge

Senior Software Developer

=======================

IBM Security QRadar EDR: How to Prevent Ransomware and Protect Your Business Security

6 min read

By:

Pooja Parab, Brand and Content Strategist

As ransomware attacks get faster and attackers get more efficient in carrying out the attacks, it’s critical to prevent ransomware and limit its impact. How? With early detection and faster responses.

The time to carry out ransomware attacks dropped by 94% over the last few years, according to the IBM Threat Intelligence Index (TII) 2023 report. This means that the average time it took to deploy a ransomware attack went from over two months in 2019 to just under four days in 2021.

Is there anything we can do about this? Yes, but first let’s find out more about ransomware.

What is a ransomware attack?

Ransomware is a form of malware that prevents a user or organization from accessing their own files on their computer and keeps them locked until a ransom is paid to the attacker.

Giuseppe Bonfa, Client Technical Support Engineer at IBM Security, explains, “Ransomware is the final act of a full infrastructure-wide breach. Typically, the attacker will move across the network, trying to reach the most sensitive assets and data, and once they find them, they will run the attack. While the initial breach might happen by a simple workstation, it can have disastrous effects on the whole network.”

Evolving ransomware attack scenarios

Talking about the current ransomware threat landscape, Bonfa adds, “Nowadays, ransomware does not come alone—it’s followed by data exfiltration and information leakage on the dark web.”

While early ransomware attackers typically had a ransom demand to unlock the data, today, when attackers see a weakness, they exploit it. According to the TII report, whether it’s ransomware, business email compromise (BEC) or distributed denial of service (DDoS), 27% of attacks were extortion-related. As extortion gets more personal, ransomware attacks are just the tip of the iceberg as cybercriminals incorporate severe psychological pressure into their attack methods.

The payment for the earliest ransomware variants used to be sent by snail mail, whereas today, cybercriminals demand payment to be sent via cryptocurrency or a credit card. Some ransomware attackers sell the service to other cybercriminals, known as Ransomware-as-a-Service or RaaS.

Types of ransomware

There are two general types of ransomware:

  • Encrypting ransomware (or crypto ransomware): This type holds a victim’s sensitive data hostage by encrypting it.
  • Locker ransomware: This locks a user’s entire device.

Ransomware can be further classified into subcategories like leakware/doxware, mobile ransomware, Ransomware-as-a-Service, wiper ransomware and scareware. Whichever ransomware type a threat actor uses, the primary objective is to gain access to and encrypt a user’s files and data so the user can’t access them.

Ransomware prevention: How to prevent ransomware attacks

Interestingly, a demand for payment is the last stage of a ransomware attack. Hackers, first and foremost, will spend months or even years gaining access to the network before finally sending a ransom note. While a ransomware attack is difficult to identify before ransomware is deployed on the system, the way to stop ransomware begins with early detection.

It is vital to understand that traditional signature-based antivirus software is not enough to protect businesses against sophisticated ransomware or malware attacks. Attackers avoid using signature-based malware that can be blocked by an antivirus or a firewall.

Leveraging a powerful endpoint detection and response (EDR) solution like IBM Security QRadar EDR can help detect and remediate advanced ransomware threats in seconds. Unlike antiviruses, EDR solutions don’t rely on known signatures and can detect unknown or fileless threats.

Four ways the IBM Security QRadar EDR solution can help prevent ransomware

The IBM Security QRadar EDR endpoint security solution can help protect your organization by detecting a ransomware attack in the early attack stages. Let’s find out how.

1. Behavioral detection helps understand ransomware attacks better

IBM QRadar EDR uses intelligent automation, artificial intelligence (AI) and machine learning to detect new and advanced threats in near real-time. It identifies anomalous activities that indicate ransomware behavior (e.g., an unusual backup deletion or an encryption process that suddenly starts without warning) and automatically terminates them upon detection.

This way, even as new ransomware variants emerge, IBM QRadar EDR uses data mining to empower security teams to automatically hunt for threats that share similarities at the behavioral and functional levels with other incidents and respond accordingly. This delivers the results in just seconds and helps facilitate the discovery of dormant threats that could dwell in an environment but may otherwise go unnoticed for months or even years, waiting to be used by an attacker. Infected devices and threat activity can also be isolated to catch lateral movement.

Moreover, IBM QRadar EDR provides security teams with a behavior-tree visualization that delivers detailed behavioral analytics and full attack visibility. This helps analysts view the breadth of the cyberattack on a single screen, helping them respond faster.

Full attack visibility shows the scope of the ransomware attack so analysts can respond accordingly.

2. Threat hunting for ransomware helps gather actionable threat intelligence

IBM QRadar EDR can quickly determine whether new threats have entered an environment and help security teams identify the “early warning signs” of an attack and patch weak spots. It helps track in-memory and fileless threats that are especially hard to follow when attackers use different ransomware variants and move within a large infrastructure. The threat-hunting capabilities of the IBM QRadar EDR endpoint detection solution allow a real-time, infrastructure-wide hunt for the presence of indicators of compromise (IOCs), binaries and behaviors, and enable teams to remediate them.

An endpoint security platform like IBM QRadar EDR helps reduce investigation time from minutes to seconds with threat intelligence and analysis scoring. Analysts can identify potential threats with metadata-based analysis to expedite triage.

3. Mitigating cyber threats with offline ransomware protection

With the shift in work trends and an increase in the number of endpoints, employees are used to working on the internet or over a virtual private network (VPN) connection that ensures secure access to the network. Unlike some EDR security tools that require a connection to a back-end server to offer full protection, IBM QRadar EDR helps protect against ransomware even when there is no working internet connection. This capability is critical when a user accidentally opens a ransomware-infected document while traveling. An AI-driven EDR solution like IBM QRadar EDR blocks the ransomware automatically upon detection and prevents encryption.

4. Detecting and responding to processes downloaded from phishing emails

Phishing, a form of delivery for ransomware or malware, is the top infection vector for attackers, with more than half of phishing attacks using spear-phishing attachments to gain access, according to the TII report. The IBM QRadar EDR solution helps protect organizations against malicious emails by providing deep visibility into the processes and applications that run on endpoints. With IBM QRadar EDR, security teams can detect and block any binary or process that is downloaded and launched from faulty links or malicious attachments. It also provides protection against malicious software that is auto-downloaded to your endpoint or runs in the background.

With its fast endpoint detection and malware reporting, IBM QRadar EDR can help reduce the overall impact of any type of malware attack to save both time and expenses for businesses. This blog post demonstrates how IBM QRadar EDR can detect and respond to malware the minute a suspicious email is clicked.

While endpoint security should not be the sole protection in your threat detection and cybersecurity strategy, it should still be the initial mechanism (along with an extended detection and response security solution) for identifying suspicious malware behavior.

Best practices for ransomware prevention
  1. Enhance your security posture by conducting a security assessment, minimizing your attack surface and mapping against known and potential vulnerabilities.
  2. Establish security awareness among employees about the risk of macros in email attachments and ensure email security is maintained by blocking macros from running in Microsoft Office apps.
  3. Adopt a zero-trust framework to make it harder for attackers to move laterally throughout compromised assets.
  4. Maintain an aggressive and current security patch management policy, particularly for vulnerabilities in browser plug-ins like Adobe Flash and Java that are commonly used by employees.
  5. Use multifactor authentication (MFA) whenever possible to ensure stolen passwords or default login credentials aren’t readily usable to attackers.
  6. Develop and rehearse an incident response plan so your business can act quickly and effectively if a stressful situation relating to threats or disruptions arises.
  7. Maintain offline data backups to prevent data loss and recover quickly in case of emergencies.
  8. Implement an email security solution that conducts attachment sandboxing and URL filtering before malicious links containing ransomware can be delivered.
Get started with IBM QRadar EDR

Pooja Parab

Brand and Content Strategist

=======================

How High Performance Computing Can Continue to Transform Financial Services Cloud

3 min read

By:

Alan Peacock, General Manager, IBM Cloud Delivery & Operations

In today’s evolving business landscape, the ability to conduct analyses and make decisions quickly is critical to remaining competitive.

These capabilities are even more important for highly regulated industries, such as financial services, where the derived insights can be used to help reduce risk. As financial services organizations look to drive innovation while keeping data secured, we have historically seen them embrace high performance computing (HPC) to help quickly conduct risk analysis, make decisions faster and meet their regulatory requirements.

Now more than ever, we are seeing financial services institutions increasingly leverage HPC for more capabilities, including to power artificial intelligence (AI) and machine learning solutions that can be used to help enterprises make more informed decisions.

IBM Cloud® HPC is designed to help clients perform complex calculations to quickly provide insights and gain a competitive advantage, while still prioritizing the security of their data. With security and controls built in to our platform, IBM Cloud HPC also allows clients to consume HPC as a fully managed service while helping them address third- and fourth-party risks. As financial services continues to transform, IBM remains committed to its mission to help reduce risk for the industry with resiliency, performance, security, compliance and total cost of ownership at the forefront.

Using HPC to transform risk management

In the world of financial services, having as much compute power as possible is key for peak performance. There are numerous situations that can drastically increase the computational capacity financial firms must provide on a daily basis, including competitive pressures, regulatory reporting, security demands and market swings that require them to perform complex calculations to reassess risk quickly.

For example, banks must be able to assess risk on an ongoing basis and report to regulators. Banks have long relied on Monte Carlo simulations—calculations that can help predict the probability of a variety of outcomes against multiple potential variables—to generate a comprehensive view of risk.

Yet as risks grow, we expect regulatory agencies across financial services markets to increasingly demand a higher degree of certainty of outcome from the models. To achieve the desired accuracy, models need more data and significantly more iterations and computations. This is where we believe HPC—which can deliver intensive computer power at scale—can be especially useful. To support the demands of today’s regulatory standards, HPC is designed to help financial services deliver the performance levels these computationally intensive calculations require, whether located on-premises or in the cloud.

Leveraging AI and machine learning powered by HPC

As consumers, we put a lot of trust into banks and know they have access to a variety of data—ranging all the way from transactions and credit scores to things like our ages. While having all of this data is important to help deliver the types of high-quality and frictionless customer experiences we expect in today’s digital-first world, banks must first convert this data into actionable insights to create true value.

To analyze this data quickly and efficiently, banks can turn to AI and machine learning solutions powered by HPC. While HPC is not new to the financial services industry (with its long history in allowing organizations to perform calculations), now is the time for banks to transform and leverage HPC for a new era that is driven by AI and machine learning. By using HPC to power AI and machine learning, financial services institutions can make sense of data to help them make key business decisions.

For example, banks can leverage HPC-powered AI for critical situations like fraud detection or for non-urgent actions like customer service. In these scenarios, algorithms can identify unusual, potentially fraudulent activity on a customer’s credit card or detect specific sentiments (such as frustration or annoyance) in a customer service interaction powered by virtual assistants. In both scenarios, the algorithm can help trigger an action, such as flagging the suspicious credit card activity to the customer via text or transferring the customer to a live agent for further help before their frustrations escalate.

As the financial services industry continues to transform, banks can leverage HPC to help calculate and report their risk position to their regulators on a daily basis. If financial institutions can generate increasingly accurate risk analyses and embrace next-generation technologies like AI and machine learning (all powered by HPC), they can drive their innovation goals while also addressing their regulatory requirements.

Get started

Learn more about IBM Cloud® HPC.

Alan Peacock

General Manager, IBM Cloud Delivery & Operations

=======================

Enhancing the Speed of AI Inferencing with the Power10 Chip Artificial intelligence

1 min read

By:

Rodrigo Ceron, Senior Managing Consultant

An inferencing model is one that has been trained to recognize patterns of interest in data and is then used to extract insights from new data.

Inferencing does not need as much compute power as training an artificial intelligence (AI) model. Thus, it is entirely possible—and even more energy efficient—to run inference without any extra hardware accelerators (such as GPUs), and even to perform it on edge devices. It is common for AI inferencing models to run on smartphones and similar devices using just the CPU. Many picture and face filters on social media phone apps are AI inferencing models.

IBM’s Power10 chip

IBM was the pioneer in adding on-processor accelerators for inferencing, called the Matrix Math Accelerator (MMA) engines, in its IBM Power10 chip. This gave the Power10 platform the ability to be faster than other hardware architectures without the need to spend an extra watt of energy on added GPUs. The Power10 chip can extract insights from data faster than any other chip architecture and consumes much less energy than GPU-based systems, and that is why it is optimized for AI.

Leveraging IBM Power10 for AI, especially for inferencing, does not require any extra effort from AI DevOps teams. The data science libraries—such as OpenBLAS, libATen, Eigen and MLAS, to name a few—are already optimized to make use of the MMA engines. So, AI frameworks that leverage these libraries—such as PyTorch, TensorFlow and ONNX—already benefit from the on-chip acceleration. These optimized libraries are available through the RocketCE channel on anaconda.org.
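
As a hedged illustration only (the package name is an assumption; check the RocketCE channel on anaconda.org for the packages that are actually published), installing an MMA-optimized framework build could look like this:

# Install a PyTorch build from the RocketCE channel (package name is an assumption)
conda install -c rocketce pytorch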

IBM Power10 can also speed up inferencing by using reduced-precision data. Instead of feeding the inference model with 32-bit floating-point data, one can feed it with 16-bit floating-point data, for example—filling the processor with twice as much data for inferencing at the same time. This works well for some models without compromising the accuracy of the inference results.

Inferencing is the last stage of the AI DevOps cycle, and the IBM Power10 platform was designed to be AI-optimized, thus helping clients extract insights from data in a more cost-effective way, both in terms of energy efficiency and the reduced need for extra accelerators.

Learn more

If you want to learn more about inferencing on Power10, please reach out to IBM Technology Expert Labs at technologyservices@ibm.com.  

Rodrigo Ceron

Senior Managing Consultant

=======================

5 Proactive Steps to Secure Your Supply Chain Security

4 min read

By:

Dimple Ahluwalia, VP & Global Senior Partner, Security Services

How a solution like IBM Security Supply Chain Cyber Risk Management Services can help protect supply chains that are vulnerable to a cyberattack chain reaction.

For cybercriminals, the supply chain presents an extremely enticing target. Because it comprises multiple vendors, manufacturers and other third-party organizations (each with access to the same data and systems), there’s potential for a real domino effect of destruction when a data breach or cyberattack occurs. A single successful cyberattack on a supply chain has the potential not only to significantly impact an organization’s operations but also to lead to disruption with business partners and financial losses across the board. That’s not even considering the long-lasting ramifications of reputational damage with both partners and consumers.

Cyberattacks in manufacturing and supply chains

According to the 2023 IBM Security X-Force Threat Intelligence Index, manufacturing saw the highest number of extortion cases across all industries (at 30%), and more than one-quarter of attacks overall were extortion-related—whether ransomware, business email compromise (BEC) or DDoS. With its low tolerance for downtime and sensitivities to double-extortion tactics, manufacturing makes an attractive target for cybercriminals.

More than half of security breaches are attributed to supply chain and third-party suppliers, at a high average cost of USD 4.46M. As a complex network that is constantly changing and evolving, it can be difficult for an organization to stay up to date on the latest cybersecurity threats and to identify potential vulnerabilities in their supply chain. When cyberattacks do occur, it can be challenging to determine which entity is the source of the security breach. Confusion can slow response time, and when it comes to a data breach, every second counts.

According to the IBM Security X-Force Threat Intelligence Index, while there was a slight decline in ransomware attacks, the time to execute attacks dropped 94% over the last few years. What used to take months now takes attackers mere days. With attackers moving faster, organizations must take a proactive, threat-driven approach to cybersecurity.

So, why are supply chains so vulnerable? In short: the impact from a cyberattack or data breach is potentially devastating. Organizations in the supply chain know they are vulnerable, and so do the cybercriminals.

One of the best ways to guard against cyberattacks is to understand where and how they are happening. When considering cyber risk management, the various types of cybersecurity incidents that can adversely impact a supply chain are phishing attacks, malware infections, data breaches and ransomware attacks.

How to secure your supply chain

Securing your supply chain through cyber risk management is crucial in today's digital landscape. Many organizations currently have a fragmented approach to supply chain security and are faced with challenges like risk identification and management, assessment of third-party software, limited threat intelligence for timely decision-making, and a lack of operational resilience. Taking a proactive approach that is well-defined, adaptive and optimized by data and AI is one of the most important things supply chains can do to bolster their cybersecurity stance.

To secure your supply chain, consider implementing the following five leading practices for developing a cyber risk management plan:

  1. Conduct risk assessments: Regularly assess the cyber risks associated with your supply chain—including the systems and processes used by your suppliers. Identify any vulnerabilities and prioritize the most critical ones with greater business impact for mitigation.
  2. Establish security protocols: Set clear security protocols for your suppliers, including guidelines for data protection, access control and incident response. Ensure that your suppliers have the necessary security measures in place, such as firewalls, encryption, strong passwords and multi-factor authentication.
  3. Implement continuous monitoring: Continuously monitor your supply chain for any security incidents, including hacking attempts, data breaches and malicious software infections. Establish an incident response plan in case a security breach occurs and periodically run tabletop or immersive exercises to strengthen muscle memory for when it comes time to execute the plan.
  4. Encourage supplier education: Most organizations educate their workforce on cybersecurity-related topics and practices to safeguard company data and assets. If structured learning is not offered by your supplier, consider options to either extend training and education to your suppliers on cybersecurity best practices and the importance of protecting sensitive data, or point them to free resources. Encourage them to adopt robust security measures and to be vigilant against cyber threats.
  5. Regularly review and update policies: Regularly review and update your cyber risk management policies to ensure they are up-to-date and relevant. This will help you stay ahead of evolving threats and maintain the security of your supply chain.
Learn more about IBM Security Supply Chain Cyber Risk Management Services

Securing your supply chain is a journey, and IBM can be your trusted partner. Launching today, IBM Security Supply Chain Cyber Risk Management Services can help organizations develop a comprehensive approach to identify and mitigate security and regulatory risks that their current and potential suppliers may carry.

Learn more about securing the supply chain in this upcoming webinar or schedule a consultation here.

Want to better understand how threat actors are waging attacks and learn how to proactively protect your organization? Read the full 2023 IBM Security X-Force Threat Intelligence Index and view the Threat Intelligence Index Action Guide for insights, recommendations and next steps.

Dimple Ahluwalia

VP & Global Senior Partner, Security Services

=======================

Sustainability as a Strategic Business Imperative

2 min read

By:

Jonas Ullberg, VP, Cognitive Systems - Global Channel Sales

Sustainability is driving business strategy.

Sustainability today is a strategic business imperative. Nearly 60% of CEOs tell us they see significant demand from their investors for greater transparency on sustainability[1]. Sustainable businesses perform better as well: 80% of those same CEOs say their investments in sustainability are expected to improve business results over the next five years[2].

The sense of urgency to take action is generating increased interest in clean energy. For example, we see that cloud providers are increasingly adopting the practice of scheduling their energy-intensive computing workloads to be executed when their data centers are powered by renewable energy. However, while switching to cleaner energy may lower greenhouse gas emissions, reducing the consumption of energy is another important action.

How can IBM Power support enterprise sustainability goals?

IBM has a long history of environmental commitment, including a focus on design for environment across its product portfolio. IBM’s Power10 processor-based servers are the ideal platform to assist clients who wish to leverage more efficient IT to help achieve their sustainability ambitions. With 2.5x better per-core performance compared to x86 servers[3], IBM Power provides clients with an opportunity to consolidate their existing data center footprint. This can result in lower energy consumption; for example, the IBM Power E1050 provides comparable performance and uses 50% less energy at maximum input power than the compared x86-based server.[4]

Power Private Cloud with Dynamic Capacity is a flexible option to further optimize energy use based on business needs. Power servers also have several energy management and monitoring tools available to assist clients. IBM EnergyScale, an energy optimization advisor, helps users understand electrical and cooling usage. This helps enable oversight of important performance parameters such as facility planning, energy, and cost savings.[5]

Building on a strong partnership for future sustainability solutions

More and more clients I meet with share with me their continued focus on all aspects of data center sustainability as strategically important to their overall business. I talk to them about how IBM Power can be an excellent choice for data centers looking for energy efficiency and scalability and how we are continuing to innovate to remain a great choice in the future. Power10 processor-based servers are designed for business-critical workloads and overall data center efficiency and flexibility.

Our global ecosystem of partners is excited about Power10. They are helping our clients explore the innovation built into our technology and run mission critical workloads on the IBM Power platform to help them meet their requirements for performance, security, and availability. Learn about the latest with IBM Power here and other sustainability solutions from IBM here.

 

 

[1] IBM Institute for Business Value (IBV) Research Insights, Sustainability as a transformation catalyst – trailblazers turn aspiration into action, January 2022

[2] IBM Institute for Business Value (IBV) Research Insights, Sustainability as a transformation catalyst – trailblazers turn aspiration into action, January 2022

[3] SPECInt Math: (Power10 2170 peak /120 core)/(1620 peak/224 cores)=2.5. Max System SPECint IBM Power E1080 (3.55-4,0 GHz, Power10) 120 Cores, 8 CPUs SPECint Score 2170 per CPU Score 271.25 per Core Score 18.08 Date: As of Sept 2, 2021. Max System SPECint Hewlett Packard Enterprise Superdome Flex 280 (2.90 GHz, Intel Xeon Platinum 8380H) 224 Cores, 8 CPUs Intel Xeon Platinum 8380H Speed 2900 Mhz SPECint Score 1620.00 per CPU Score 202.50 per Core Score 7.23 Date: Feb-2021 Link: CPU2017 Integer Rate Result: Hewlett Packard Enterprise Superdome Flex 280 (2.90 GHz, Intel Xeon Platinum 8380H) (test sponsored by HPE) (spec.org)

[4] Source: IDC; Performance is based on Quantitative Performance Index (QPI) data as of July 18, 2022, from IDC available at https://www.idc.com/about/qpi. IBM Power E1050 (4x24c Power10) QPI of 192,831 versus HPE Superdome Flex 280 (8x28-core Xeon 8280M) QPI of 187,005. Energy consumption is based on maximum input power: IBM Power E1050 with maximum power of 5,200 W. https://www.redbooks.ibm.com/redpapers/pdfs/redp5684.pdf Superdome Flex 280 with maximum power of 10,540 W https://www.hpe.com/psnow/doc/a00059763enw?jumpid=in_lit-psnow-red

[5] IBM EnergyScale for Power10 Processor Based Systems, https://www.ibm.com/downloads/cas/E7RL9N4E

Jonas Ullberg

VP, Cognitive Systems - Global Channel Sales

=======================

Making IBM Cloud for Financial Services Work for You Cloud

5 min read

By:

Tony Erwin, Senior Technical Staff Member
Erick de Carty, Principal Product Manager, Cloud for FS

See how the unique industry-specific capabilities of IBM Cloud for Financial Services are designed to help you reduce risk and accelerate cloud adoption.

Are you responsible for developing, deploying or managing applications and data in the financial services industry? Do you spend a lot of time worrying about all the associated risks, compliance standards and regulatory requirements? Would you rather spend more time focused on how to deliver value to your clients? If so, keep reading to learn how IBM Cloud for Financial Services® can help you mitigate risk and accelerate your adoption of the cloud.

IBM Cloud® is well-suited for regulated workloads with its end-to-end cloud security capabilities and support for a wide range of compliance programs. IBM Cloud for Financial Services extends the capabilities of IBM Cloud to provide an industry-driven cloud platform that supports the unique requirements of the financial services industry. It hosts a rich ecosystem of IBM Cloud and partner services that makes it easier to achieve and demonstrate regulatory compliance postures for your financial services workloads.

In addition, the IBM Cloud Framework for Financial Services provides the following accelerators to help you effectively use IBM Cloud for Financial Services to host even your most sensitive and mission-critical workloads:

  • A comprehensive, first-of-its-kind set of control requirements designed to help address the security and regulatory compliance obligations of financial institutions.
  • Detailed implementation guidance for each control requirement to go hand-in-hand with detailed reference architectures.
  • Automation to make it easier to deploy and configure the reference architectures.
  • Tools that enable you to efficiently and effectively monitor compliance, remediate issues and generate evidence of compliance.

Learn more about each accelerator in the sections that follow.

Industry-specific control requirements

The framework’s 565 control requirements serve as the foundation for the IBM Cloud  for Financial Services, and they cover administrative, technical and physical concerns common across the financial services industry. The control requirements were initially based on NIST 800-53 and have been enhanced significantly based on collaboration with major financial institutions around the world. As the regulatory landscape changes, we continue to update the framework based on evolving industry standards and feedback from our partners. In addition, we have partnered with organizations like the Cloud Security Alliance (CSA) to map the control requirements to the CSA's Cloud Controls Matrix (CCM), a cybersecurity control framework for cloud computing that helps to address third- and fourth-party risk in the cloud.

IBM Cloud provides a rich set of data centers, infrastructure and services that have evidenced compliance with the control requirements and have been designated as IBM Cloud for Financial Services Validated. This means you can use these components for your financial services workloads knowing that the control requirements are integrated into the technology stack. Keep in mind that all IBM Cloud services are designed with security in mind, and many are certified under other compliance programs, such as ISO, SOC, etc. So, even cloud services that are not yet Financial Services Validated may be considered for use in your solutions, depending on your use case, the sensitivity of your data, etc.

Furthermore, we have a growing partner ecosystem of services and software that have received the Financial Services Validated designation. This means you may use these offerings within your solutions and spend less time and effort vetting third-party risk and compliance.

Guidance and reference architectures

The framework also provides detailed implementation and evidence guidance for each control requirement. The guidance provides the information you need to design, develop, deploy and manage your applications in a way that meets the security and regulatory requirements defined by the control requirements. Along with the extensive deployment and configuration guidance that takes advantage of a shared responsibility model, three pre-defined reference architectures (shown below) are provided. These architectures demonstrate how to stitch together Financial Services Validated ecosystem components and serve as a secure basis for running your own financial services workloads on IBM Cloud:

Automated deployable architectures

The framework also provides Infrastructure as Code (IaC) using Terraform—a declarative open-source tool for provisioning and infrastructure orchestration—to automate deployment of the VPC reference architecture on IBM Cloud. This enables you to deploy a reference architecture with greater speed, less risk and reduced cost.
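
The automation is consumed through a toolchain, but under the covers a deployable architecture follows the standard Terraform workflow. The sketch below is generic and not the exact FS Cloud pipeline:

terraform init    # download the IBM Cloud provider and referenced modules
terraform plan    # preview the resources of the VPC reference architecture
terraform apply   # provision the architecture in your account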

The automation is provided as a toolchain through IBM Cloud Continuous Delivery® to help you build out a secure software development lifecycle (SDLC). The toolchain injects the Code Risk Analyzer component of Continuous Delivery into your workflow for code and security scanning.  This is an example of “shift left” (DevSecOps) where security and vulnerability checks are added earlier in the development lifecycle. In this case, Code Risk Analyzer will analyze your Terraform against a set of compliance checks mapped to a subset of control requirements. If any of them fail, your Terraform is not executed. This helps to ensure your deployments are secure by default.

If you want to try it out, visit Deploy infrastructure as code on the IBM Cloud for Financial Services.

Continuous compliance monitoring

Once you’ve deployed your solution, it’s very important to ensure your continued compliance against the control requirements and associated guidance. With IBM Cloud® Security and Compliance Center, you can integrate daily, automatic compliance checks into your SDLC to monitor for possible security flaws and changes in baseline configurations that need corrective action. Unlike Code Risk Analyzer, Security and Compliance Center runs its tests against a live system.

Security and Compliance Center includes a pre-defined IBM Cloud for Financial Services profile that offers a set of automated tests appropriate for the VPC reference architecture. These tests are mapped to a growing subset of control requirements. While a successful scan does not ensure overall regulatory compliance, it provides a powerful point-in-time statement of your current posture against the control requirements for a specific group of resources against a robust set of baseline tests.

Conclusion

This post shows how the unique industry-specific capabilities of IBM Cloud for Financial Services are designed to help you reduce risk and accelerate cloud adoption.  You’ve also seen how the resources in the IBM Cloud Framework for Financial Services—control requirements, implementation guidance, reference architectures, automated deployments and continuous compliance monitoring—allow you to make the IBM Cloud for Financial Services work for you as you build your own financial services applications. Our goal for these resources and tools is to free up your resources so that you can focus on core competencies and drive innovation for yourself and your clients.

If you’re ready to discuss and align your strategic initiatives, assess your cloud risk or leverage IBM Cloud for Financial Services as a force multiplier, connect with an IBM Cloud expert. In addition, if you represent a financial institution and want to collaborate on reducing the risk of cloud consumption across the financial services industry, we invite you to become a member of the Financial Services Cloud Community.

Tony Erwin

Senior Technical Staff Member

Erick de Carty

Principal Product Manager, Cloud for FS

=======================

IBM CIO Organization's Application Modernization Journey Cloud

4 min read

By:

Sachin Avasthi, STSM, Lead Architect for Application Modernization
Jay Talekar, STSM, App and Data Modernization

Exploring the IBM CIO organization’s application modernization journey.

For the last decade or so, many enterprises have migrated their IT applications from their on-premises data centers to the cloud. As-is migrations reaped early benefits like operational cost reduction and scalability, but applications designed and developed for on-premises environments failed to provide business agility, higher availability or better performance.

The next chapter in the cloud transformation journey is to optimize and operate more efficiently, enabled by a hybrid cloud strategy. Modernizing your applications to be more cloud-native is the key to achieving cost-efficient agility, increasing resiliency and improving developer productivity while fostering innovation.

LeanIX conducted a survey of enterprise architects and leaders from 140 companies on IT priorities for 2022. The survey showed that 41% of the respondents selected “Reducing technical debt and upgrading legacy systems” as their top priority, compared to 24% who selected “Cloud migration and expanding the cloud infrastructure.” The focus is shifting toward modernizing legacy systems to address technical debt.

The IBM CIO organization has undertaken a large transformation to modernize traditional applications and infrastructure to a hybrid cloud platform. The CIO Hybrid Cloud—built with Red Hat and IBM technologies—is the engine that fuels the future of IT for the IBM enterprise. Application modernization combined with our hybrid cloud strategy enabled us to operate with speed, scale, security and simplicity. Before diving deep into how the IBM CIO undertook this mammoth activity, let’s first talk about what application modernization is and why it is needed.

What is application modernization?

Simply put, application modernization is a set of tasks or activities performed to make an application function better. An application might require automation of repetitive tasks, need to be upgraded to a newer software level, need to be redeployed, or need to be rewritten. As you can see, the answer really depends on each individual application. To summarize, application modernization involves the following steps:

  • Improving your current application architecture or development practices.
  • Simplifying the maintenance and management of the current application.
  • Making your application more sustainable.
  • Migrating to a new platform, either completely or partially, to take advantage of new, innovative technologies.
Drivers for application modernization

Now that we have defined application modernization, we should ask why we need to do it. Just as in everyday life we upgrade our smartphones, tablets and other devices to the latest software or hardware to make our lives easier, more efficient and more sustainable, similar considerations apply to an enterprise and its IT applications. Drivers for modernization are many and varied. They not only help you stay ahead of the competition but also provide the following:

  • Business opportunities due to market demands and trends
  • The agility to adapt to changing business processes by implementing a microservices-driven architecture with shared services
  • Consolidation of data centers by streamlining technology platforms and making applications infrastructure-independent
  • Use of automation and emerging technologies to drive simplification of business and IT processes
  • Increased cost efficiencies by removing functional and application redundancies and improving CI/CD practices

Once we have identified why we need to modernize an application, we then need to define what that means for us. Modernization is not only migrating (rewriting, redesigning or replatforming); it is also adapting (improving DevSecOps, non-functional requirements, user experience, developer experience, etc.). It is imperative that we provide businesses with multiple approaches to modernizing their suite of applications.

The application modernization journey

The CIO modernization journey started with an assessment of our IT portfolio—mainly around as-is discovery of the applications on existing platforms and technology stacks—and understanding business criticality and end-user pain points. This analysis helped us to understand the business value of the applications and determine what to do with them: retire, retain, relocate, rehost, replatform or refactor. For those business applications that should remain, moving to standard modern software development practices makes it easier to take the next steps in modernization.
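To make the disposition step concrete, here is a minimal, illustrative Python sketch of how an application inventory might be sorted into those dispositions. The scoring fields and thresholds are hypothetical and are not the IBM CIO organization's actual assessment criteria.

  # Toy disposition classifier for an application portfolio (illustrative only).
  # The fields and thresholds are hypothetical, not the CIO's real criteria.
  from dataclasses import dataclass

  @dataclass
  class App:
      name: str
      business_value: int   # 1 (low) .. 5 (high)
      technical_fit: int    # 1 (poor) .. 5 (good)
      cloud_ready: bool

  def disposition(app: App) -> str:
      if app.business_value <= 1:
          return "retire"
      if app.technical_fit >= 4 and not app.cloud_ready:
          return "retain"            # healthy where it is; revisit later
      if app.cloud_ready:
          return "rehost/relocate"   # lift-and-shift candidate
      if app.business_value >= 4:
          return "refactor"          # high value justifies deeper investment
      return "replatform"            # e.g., move to lightweight containers

  portfolio = [
      App("legacy-reporting", business_value=1, technical_fit=2, cloud_ready=False),
      App("sales-tracker", business_value=5, technical_fit=2, cloud_ready=False),
      App("notification-svc", business_value=3, technical_fit=4, cloud_ready=True),
  ]
  for app in portfolio:
      print(f"{app.name}: {disposition(app)}")

In practice, the thresholds would come from the portfolio assessment itself (business criticality, end-user pain points, platform constraints) rather than fixed numbers like these.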

Because expansive modernization is time-consuming and costly, we decided to perform runtime or operational modernization first. We focused on choosing the right deployment platform and cloud-native technologies during this phase. One such example is migrating J2EE applications from on-premises traditional WebSphere servers to lightweight WebSphere Liberty containers. This runtime modernization is performed with minimal code changes, making the legacy applications portable and resilient. Most importantly, it paved the way for architecture modernization from monoliths to microservices and from bloated virtual machines to containers.

The IBM CIO organization manages a large portfolio of business-critical applications, some of which were created in the 1990s. Our biggest challenge was to scale this modernization approach across hundreds of applications. We created reusable modernization patterns based on the technology stacks and provided a detailed planning blueprint. Maturing our DevSecOps practices and increasing automated testing helped tremendously in accelerating production deployments.

As an example, we modernized a workflow-based web application that records and tracks the engagement of IBM Consulting Solution Managers on sales opportunities. The web application was hosted on a cluster of virtual machines (VMs) with a technology stack of traditional IBM WebSphere, IBM Db2, IBM MQ and IBM FileNet. The standalone databases and document repositories, previously deployed alongside the application on VMs, were moved to IBM Db2 on Cloud and IBM Cloud Object Storage.

We first transformed the web application from traditional WebSphere to WebSphere Liberty containers on a CIO-wide managed Red Hat OpenShift platform. We then migrated the databases to Db2 as a Service on IBM Cloud. The application was further optimized to run and deploy business rules on Operational Decision Manager (ODM) on IBM Cloud. The modernization reduced latency and improved the overall performance of the web application. Utilizing shared services in an as-a-service model led to cost reductions and higher resource availability. This not only improved DORA metrics and user satisfaction, but also allowed the team to upskill in the latest cloud technologies. The knowledge gained was shared with other application teams in the form of reusable patterns.

This journey was made easier by utilizing tools like IBM Cloud Transformation Advisor (CTA), IBM Mono2Micro and Konveyor. CTA helps us with application and server discovery and points us toward what we can improve or enhance and which platform would work best for us. IBM Mono2Micro helps developers create a microservices architecture, regardless of skill level or technical knowledge; it breaks the monolith into partitions that can be the starting point for microservices and provides code for them. Konveyor is a CNCF open-source project with multiple tools (like Tackle-Test, Forklift, etc.) that help accelerate the modernization journey by providing paths toward rehosting, replatforming or refactoring.

The next set of blogs in this series will take you through a deep dive on how IBM Cloud Transformation Advisor and IBM Mono2Micro helped us accelerate the modernization in the IBM CIO.

Looking to the future

Application modernization is a continuous process, and application squads should not assume that once they reach a defined milestone, the modernization has ended. High-performing software teams always look for opportunities to modernize and focus on how they will manage future technology and strategy evolution. This helps them adapt to change faster and more easily, depending on the needs of the hour. It's a journey, not a fixed destination; as business needs change, it's always on the move.

Learn more about WebSphere Hybrid Edition.

Discover how to increase WebSphere ROI.

Sachin Avasthi

STSM, Lead Architect for Application Modernization

Jay Talekar

STSM, App and Data Modernization

=======================

IBM Security QRadar SIEM: What a Leading Security Information and Event Management Solution Can Do Security

5 min read

By:

Jackie Lehmann, Program Director, Product Marketing, QRadar

How a leading SIEM solution like IBM Security QRadar can accelerate your threat detection and investigation. [1]

With cybersecurity threats on the rise, it’s important to ensure your organization has a full view of your environment. A threat detection and response solution can generate high-fidelity alerts that allow security analysts to focus on what really matters and respond quickly and effectively.

According to the X-Force Threat Intelligence Index 2023, the most common threat actions on objective were the deployment of backdoors (21%), ransomware (17%) and business email compromise (6%). While backdoor deployments—which enable remote access to systems—were the most common type of attacker action, the silver lining is that 67% of those backdoor cases failed to advance to ransomware attacks because defenders were able to disrupt the backdoor before ransomware was deployed.

IBM Security® QRadar® SIEM enables analysts to monitor cloud environments alongside the rest of your security enterprise data to provide prioritized high-fidelity alerts with real-time threat detection using the latest threat intelligence and built-in use cases (rules). In this demo blog, I will walk you through how a security analyst would typically investigate a threat found by QRadar SIEM and designate it for remediation.

Monitoring dashboard for potential threats

The fastest and easiest way to get started is by focusing on the threats that matter most using the Offenses tab.

The Offenses overview dashboard shows key stats about the current alerts in this company's IT environment (called "offenses" in QRadar SIEM). Looking at the table of offenses, the first column ranks them by priority or magnitude score. Offenses are created by QRadar's automated threat detection processes, which analyze events in near real-time to discover what is happening. QRadar SIEM can analyze events from two types of sources (a simplified illustration follows the list):

  1. Logs: These are events that happened at a specific point in time and are written to a log file by an application. QRadar SIEM can analyze log files from over 700 data sources.
  2. Network flows (or flows): These are network activities between two hosts on a network. They are captured by QRadar SIEM's Network Detection and Response (NDR) add-on. Flows are more reliable than log data since they represent actual real-time activity and cannot be modified.
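To make the distinction tangible, here is a hedged Python sketch contrasting a point-in-time log event with a network flow. The field names are simplified placeholders and do not reflect QRadar SIEM's actual event or flow schemas.

  # Hypothetical, simplified records that illustrate the log-vs-flow distinction.
  # Field names are placeholders, not QRadar SIEM's real schemas.
  log_event = {
      "log_source": "linux-auth",               # written by an application
      "timestamp": "2023-05-01T10:15:00Z",      # a single point in time
      "event_name": "failed_login",
      "username": "jblue",
      "source_ip": "203.0.113.7",
  }

  network_flow = {
      "source_ip": "203.0.113.7",               # a conversation between two hosts
      "destination_ip": "10.0.0.12",
      "protocol": "TCP",
      "first_packet": "2023-05-01T10:15:02Z",
      "last_packet": "2023-05-01T10:21:40Z",    # spans an interval, not an instant
      "bytes_out": 18400,
      "bytes_in": 2113000,                      # unusually large transfers can be a flag
  }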

Now, let me filter the offenses assigned to me and kick off my investigation. 

Investigating and correlating multiple events

From the Offenses overview page, I can see everything that QRadar SIEM has correlated and prioritized.

If I look at the offense description, I can see two things. First, this looks like a potential insider threat—it is definitely something I want to click on to investigate further.

The second thing I notice is the event chaining. The description demonstrates how QRadar is not only analyzing individual suspicious log events, but also comparing them to other events and collecting and correlating them into a single offense.

Assessing the magnitude of an offense

While investigating, I can also see key details about this offense. These include the source and destination IPs, which MITRE ATT&CK tactics and techniques have been detected and what use cases were triggered in relation to this offense, as well as the magnitude score breakdown.

The magnitude score is how QRadar SIEM uniquely calculates offense priority, which helps the security analyst focus on the most important offenses first. It comprises three factors (a simple weighted-sum illustration follows the list):

  1. Credibility: How much do I trust the source? (20% of the magnitude score)
  2. Relevance: How pertinent is this to my environment? (50% of the magnitude score)
  3. Severity: How bad will this be if it actually occurs? (30% of the magnitude score)
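To make those weightings concrete, here is a small, illustrative Python sketch of a weighted-sum calculation using the percentages above. It is an approximation for explanation only, with assumed 0-10 input ratings, and is not QRadar SIEM's actual scoring algorithm.

  # Illustrative only: approximate a magnitude-style score as a weighted sum
  # of credibility, relevance and severity (each rated 0-10 here by assumption).
  # This is not QRadar SIEM's actual, proprietary calculation.
  WEIGHTS = {"credibility": 0.20, "relevance": 0.50, "severity": 0.30}

  def magnitude(credibility: float, relevance: float, severity: float) -> float:
      score = (WEIGHTS["credibility"] * credibility
               + WEIGHTS["relevance"] * relevance
               + WEIGHTS["severity"] * severity)
      return round(score, 1)

  # A moderately trusted source, a highly relevant use case, mid severity:
  print(magnitude(credibility=4, relevance=6, severity=4))  # -> 5.0, a medium offense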

This offense has a magnitude score of five, which is a medium offense. So, I’ll want to continue the investigation by reviewing the events. Let’s see if QRadar SIEM found a username associated with any of this.

Searching and filtering events

By clicking on the events, QRadar SIEM shows me the query builder tool, where I see a populated view of the events associated with this offense. Here I can continue to drill into the events, filter them or, if needed, modify the AQL query to expand or narrow the number of events.
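For analysts who prefer to script this step, the same kind of AQL search can typically be run through QRadar's Ariel search REST API. The sketch below is a hedged example: the console hostname, API token and AQL statement are placeholders, and the endpoint details should be verified against the API documentation for your QRadar version.

  # Hedged sketch: submit an AQL search via QRadar's Ariel REST API and fetch results.
  # The console URL, token and query below are placeholders; check your QRadar
  # version's API documentation before relying on the exact paths and fields.
  import time
  import requests

  CONSOLE = "https://qradar.example.com"          # hypothetical console hostname
  HEADERS = {"SEC": "<api-token>", "Accept": "application/json"}
  AQL = ("SELECT username, sourceip, eventcount "
         "FROM events WHERE username = 'JBlue' LAST 24 HOURS")

  # 1. Submit the search.
  search = requests.post(f"{CONSOLE}/api/ariel/searches",
                         params={"query_expression": AQL},
                         headers=HEADERS).json()

  # 2. Poll until the search completes.
  search_id = search["search_id"]
  while requests.get(f"{CONSOLE}/api/ariel/searches/{search_id}",
                     headers=HEADERS).json().get("status") != "COMPLETED":
      time.sleep(2)

  # 3. Retrieve and print the matching events.
  results = requests.get(f"{CONSOLE}/api/ariel/searches/{search_id}/results",
                         headers=HEADERS).json()
  print(results.get("events", []))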

I'm now going to use some of the quick-filter capabilities on the left to see if any usernames have been detected within the events related to this offense. I can see that there are a few names. Let's take a look at the user JBlue.

I can now view JBlue's activity. I immediately notice privilege escalation, failed and successful logins, and so on. It looks like JBlue is up to something. I'm going to pivot into our User Behavior Analytics (UBA) tool to do some more digging on JBlue.

Running User Behavior Analytics (UBA)

On the UBA Overview page, I can see information specifically about insider threat risks in this organization's environment. Let's break this page down a little further.

On the left is the risky Monitored users list, where UBA ranks users by highest risk score (based on several analytics in QRadar SIEM). Multiple dedicated machine learning (ML) models determine normal versus anomalous behavior for each user based on their own activity and that of their learned peer group. The peer-group ML model helps UBA detect behavior outside of what is deemed normal for the peer group.
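The exact models are internal to UBA, but the core intuition can be sketched: compare each user's activity to a baseline learned from their peer group and flag large deviations. The toy Python example below uses a simple z-score as a stand-in assumption for the real machine-learning models.

  # Toy peer-group anomaly check (illustrative only). UBA's actual ML models are
  # far more sophisticated; this z-score stand-in just shows the "deviation from
  # peers" intuition.
  from statistics import mean, pstdev

  def peer_anomaly_score(user_value, peer_values):
      mu, sigma = mean(peer_values), pstdev(peer_values)
      if sigma == 0:
          return 0.0
      return (user_value - mu) / sigma          # std-devs above the peer baseline

  # Daily privilege-escalation events for JBlue vs. a peer group of programmers.
  peer_counts = [0, 1, 0, 0, 2, 1, 0, 0]
  jblue_count = 9
  score = peer_anomaly_score(jblue_count, peer_counts)
  print(f"z-score: {score:.1f}")                # a large positive value is anomalous
  if score > 3:
      print("Flag for review: activity well outside the peer-group baseline")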

Across the top is the status of my environment: how many users are being monitored and how UBA found them (imported or discovered through event analysis). I can import users through a traditional CSV (comma-separated values) file or via LDAP (Lightweight Directory Access Protocol) to identify a user based on their related attributes across log sources. At the top right, UBA shows how many of the UBA-related rules (use cases) I currently have turned on and the status of my ML models.

So now, let's double-click to find out what’s going on with the user JBlue.

Quickly completing an investigation to trigger an effective response

JBlue's User details page provides me with the details on what JBlue has been doing.

Right away, I see two use cases that matched for this session, which is a red flag to me.

When I click on the first use case—"dormant account used"—I see there are TCP events associated with a German IP address. For this scenario, I know this company does not do business there and that JBlue is a programmer in our Colorado Springs office.
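For illustration, the condition behind combining a "dormant account used" use case with a geography check can be approximated in a few lines. The sketch below is a hypothetical re-creation, not the actual QRadar rule content, and the dormancy threshold and expected-country list are assumptions.

  # Hypothetical re-creation of the rule logic, for illustration only: flag a
  # session when the account has been dormant past a threshold and the source
  # country is not one where the organization does business.
  from datetime import datetime, timedelta

  DORMANT_AFTER = timedelta(days=90)            # assumed dormancy threshold
  EXPECTED_COUNTRIES = {"US"}                   # assumed list of expected geographies

  def is_suspicious(last_seen, login_time, src_country):
      dormant = (login_time - last_seen) > DORMANT_AFTER
      unexpected_geo = src_country not in EXPECTED_COUNTRIES
      return dormant and unexpected_geo

  last_seen = datetime(2022, 11, 2, 9, 0)       # JBlue's previous recorded activity
  new_login = datetime(2023, 4, 18, 3, 12)      # the TCP session under investigation
  print(is_suspicious(last_seen, new_login, src_country="DE"))   # -> True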

Now I can add my notes and notify the response team to kick off the required remediation steps, perhaps with the help of a security orchestration, automation and response (SOAR) solution like IBM Security QRadar SOAR.

Conclusion

We've just demonstrated how QRadar SIEM can help with real-time threat detection while easing the security analyst's workload. For more information about what IBM Security QRadar SIEM can do for your business, you can request a demo or check out our webpage.

Read the full 2023 IBM Security X-Force Threat Intelligence Index and view the Threat Intelligence Index Action Guide for insights, recommendations and next steps.

 

[1] Gartner, 2022 Gartner® Magic Quadrant™ for Security Information and Event Management, October 2022.

Jackie Lehmann

Program Director, Product Marketing, QRadar

=======================

Evolving the IBM Storage Portfolio Brand Identity and Strategy Cloud Storage

4 min read

By:

Denis Kennelly, General Manager, IBM Storage

Exciting news about the future of IBM Storage.

Over the past year, we have been on a journey of making strategic shifts in our IBM Storage software portfolio—embracing open source and addressing the most critical data challenges facing every enterprise today while simplifying our brand and consumption models. 

Earlier this year, we combined the Red Hat Storage software portfolio with IBM Storage to bring together one of the most comprehensive storage software portfolios in the industry. We are investing in industry-leading, open-source Ceph as the foundation for our software-defined storage platform.

We see the following three things as the most pressing data challenges our customers face: 

  • Adopting AI/ML/high-performance computing workloads and contending with hyper-data growth.
  • Creating an information and application supply chain that gives those digital assets the freedom to move from edge-to-core-to-cloud(s).
  • Protecting the business's critical and operational workloads from the reputational impacts and data loss associated with data breaches (intentional or accidental).

IBM offers an extensive number of storage products, serving a very wide range of audiences with a broad set of needs and use cases. At the same time, we are constantly seeking ways to enhance and simplify their experiences while providing flexible consumption models.

To that end, we’re making a product branding change to ensure our portfolio continues to be easy for our customers to navigate and understand while addressing our customers’ most pressing data challenges.

Simplifying our brand expression: “IBM Spectrum” becomes “IBM Storage”

IBM Storage engaged in a thorough process that involved studying market research, collaborating with experts and, most importantly, consulting with our clients. The result was a decision to simplify our offerings by reducing the number of products we promote to a more manageable number, with a bias toward solution-led conversations that are software-defined, open and consumable as a service. It was also important to make it plainly obvious that IBM is very much in the storage business. This led to the decision to drop "Spectrum" in favor of what we do incredibly well: "Storage."

But a name change wasn't enough; we also needed to orient the full innovative might of our engineering, product and marketing efforts behind the most pressing data challenges our customers face, including the adoption of AI at enterprise scale, the digital transformation of applications and services from edge-to-core-to-cloud(s), and the protection of the business's critical and operational workloads from the reputational impacts and data loss associated with data breaches (intentional or accidental).

This decision drove the framework behind the three solution areas described below.

Address AI and hyper-data growth with IBM Storage for Data and AI

IBM Storage for Data and AI is designed to help customers get the most from their application and information supply chain to improve business outcomes. It unlocks the latent value in your data to fast-track innovation and business results—allowing our clients to eliminate their data ingest and aggregation challenges, increase data relevancy and enable faster data analysis at scale.

This solution is based on the following:

  • IBM Storage Scale: A scale-out file and object software-defined storage platform designed for AI, ML and high-performance computing workloads.
  • IBM Storage Ceph: A unified and open-source software-defined storage platform designed to address the block, file and object needs of general-purpose workloads.
  • IBM Storage Scale System (formerly known as the IBM Elastic Storage System): An all-flash and hybrid elastic compute and storage appliance designed to create highly performant clusters for IBM Storage Scale in a sustainable IT architecture.
Connect edge-to-core-to-cloud(s) with IBM Storage for Hybrid Cloud

IBM Storage for Hybrid Cloud is designed to help customers modernize their application stack, build next-generation applications and innovate faster with data orchestration services for Red Hat OpenShift. It empowers you to deploy cloud architectures on-premises and extend them seamlessly to public cloud environments. Our goal is to help our customers to stop maintaining and start innovating by taking control of their hybrid cloud environments—making IT more agile, scalable, secure, efficient and cost-effective for their stateful container environment and portable workloads.

This solution is based on the following:

  • IBM Storage Fusion: A software-defined storage platform with flexible deployment options that delivers data orchestration services to efficiently store, protect, manage, govern, mobilize and integrate data and applications hosted in Red Hat OpenShift environments.
  • IBM Storage Fusion HCI System: A purpose-built hyper-converged infrastructure that simplifies the design and deployment of Red Hat OpenShift with container-native data orchestration and storage services to enable faster and consistent Kubernetes architecture deployments.
Enjoy resiliency by default, resiliency by design with IBM Storage for Data Resiliency

IBM Storage for Data Resiliency is designed to help customers safeguard data from breaches and threats while reducing costs and downtime with resilient storage offerings. It integrates machine learning and automation with storage technology to detect anomalies and threats, speed recovery and optimize costs.

This solution is based on the following:

  • IBM Storage Defender: Reduces the threat exposure window from days to hours and proactively safeguards data and applications with a multi-faceted and scalable data resiliency offering. It defends against cyber-vulnerabilities from detection through recovery, with air-gapping capabilities that include logical, operational and physical separation from source data sets. Learn more about IBM Storage Defender in the announcement.
Timing

This new approach began rolling out at the beginning of 2023, culminating in our Data and Innovation Summit today in New York City.

I’m excited to share our updated identity and vision for IBM Storage with you. Our customers are at the heart of our IBM Storage brand, and we want to ensure you have insight into its evolution and why it is so important to reimagine what enterprise-class and software-defined storage should be in today’s market and for today’s modern business.

Learn more about IBM Storage solutions.

Denis Kennelly

General Manager, IBM Storage

=======================

Page 1|Page 2|Page 3|Page 4