EC2Rescue for Linux (ec2rl) is a diagnostic tool provided by AWS that helps users identify and troubleshoot common issues affecting Amazon EC2 Linux instances, including network configuration problems, operating system issues, and disk configuration issues.

The EC2Rescue tool provides the following functionalities:

Logs Collection: It helps in collecting various logs from the instance, including system logs, application logs, and other diagnostic information, which can be useful for identifying the root cause of issues.

File System Checks: It performs checks on the file system of the instance to identify any potential disk issues or file system errors.

Network Configuration Checks: EC2Rescue helps to diagnose common network configuration problems and suggests potential solutions to resolve networking issues.

Operating System Checks: It checks for common operating system issues that may affect the performance or stability of the EC2 instance.

Security Checks: The tool can help identify potential security vulnerabilities or misconfigurations that may impact the security posture of the EC2 instance.

How to install the EC2Rescue tool:

  1. Download the EC2Rescue tool.

    [root@ip-172-31-7-187 ~]# curl -O https://s3.amazonaws.com/ec2rescuelinux/ec2rl.tgz
    % Total % Received % Xferd Average Speed Time Time Time Current
    Dload Upload Total Spent Left Speed
    100 7499k 100 7499k 0 0 21.1M 0 --:--:-- --:--:-- --:--:-- 21.1M
    [root@ip-172-31-7-187 ~]#

  2. Extract the EC2Rescue package.

    [root@ip-172-31-7-187 ~]# tar -xf ec2rl.tgz
    [root@ip-172-31-7-187 ~]# cd ec2rl-*/
    [root@ip-172-31-7-187 ec2rl-1.1.6]# ls -lrt
    total 100
    -rw-rw-r--. 1 root root 4154 Sep 7 08:16 functions.bash
    -rw-rw-r--. 1 root root 4166 Sep 7 08:16 ec2rl.py
    -rwxrwxr-x. 1 root root 1590 Sep 7 08:16 ec2rl
    -rw-rw-r--. 1 root root 6261 Sep 7 08:16 README.md
    -rw-rw-r--. 1 root root 899 Sep 7 08:16 NOTICE
    -rw-rw-r--. 1 root root 15758 Sep 7 08:16 LICENSE
    -rw-rw-r--. 1 root root 369 Sep 7 08:16 requirements.txt
    drwxr-xr-x. 2 root root 69 Sep 7 08:16 ssmdocs
    drwxr-xr-x. 2 root root 69 Sep 7 08:16 pre.d
    drwxr-xr-x. 2 root root 37 Sep 7 08:16 post.d
    drwxr-xr-x. 2 root root 16384 Sep 7 08:16 mod.d
    drwxr-xr-x. 2 root root 60 Sep 7 08:16 example_modules
    drwxr-xr-x. 2 root root 16384 Sep 7 08:16 example_configs
    drwxr-xr-x. 3 root root 54 Sep 7 08:16 docs
    drwxr-xr-x. 3 root root 16384 Nov 13 07:25 ec2rlcore
    drwxr-xr-x. 10 root root 147 Nov 13 07:26 lib
    [root@ip-172-31-7-187 ec2rl-1.1.6]#

  3. Check the version of the EC2Rescue tool.

    [root@ip-172-31-7-187 ec2rl-1.1.6]# ./ec2rl version
    ec2rl 1.1.6
    Copyright 2016-2020 Amazon.com, Inc. or its affiliates. All rights reserved.
    This software is distributed under the Apache License, Version 2.0.

    This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    [root@ip-172-31-7-187 ec2rl-1.1.6]#

How to use the EC2Rescue tool:

  1. Run the EC2Rescue tool to check for issues on the current instance.

    [root@ip-172-31-7-187 ec2rl-1.1.6]# ./ec2rl run
    -----------[Backup Creation]-----------
    No backup option selected. Please consider backing up your volumes or instance
    ----------[Configuration File]----------
    Configuration file saved:
    /var/tmp/ec2rl/2023-11-13T07_26_40.955988/configuration.cfg
    -------------[Output Logs]-------------
    The output logs are located in:
    /var/tmp/ec2rl/2023-11-13T07_26_40.955988
    --------------[Module Run]--------------
    Running Modules:
    arptable, blkid, cgroups, clocksource, cpuinfo, *************, lvmarchives, lvmconf, messages, sarhistory, sysctlconf, systemsmanager, udev, yumlog
    ----------[Diagnostic Results]----------
    module run/openssh [FAILURE] Improper configuration of one or more OpenSSH components.
    -- SSH may deny access to users when improperly configured.
    -- FAILURE Permission mode includes permissions for groups and/or other users: /etc/ssh/ssh_host_ed25519_key
    -- Adjust permissions: sudo chmod 600 /etc/ssh/ssh_host_ed25519_key
    -- FAILURE Permission mode includes permissions for groups and/or other users: /etc/ssh/ssh_host_ecdsa_key
    -- Adjust permissions: sudo chmod 600 /etc/ssh/ssh_host_ecdsa_key
    module run/arpcache [SUCCESS] Aggressive arp caching is disabled.
    module run/arpignore [SUCCESS] arp ignore is disabled for all interfaces.
    module run/asymmetricroute [SUCCESS] No duplicate subnets found.
    module run/conntrackfull [SUCCESS] No conntrack table full errors found.
    module run/consoleoverload [SUCCESS] No serial console overload found
    module run/duplicatefslabels [SUCCESS] No duplicate filesystem labels found.
    module run/duplicatefsuuid [SUCCESS] No duplicate filesystem UUIDs found.
    module run/duplicatepartuuid [SUCCESS] No duplicate partition UUIDs found.
    module run/hungtasks [SUCCESS] No hung tasks found
    module run/ixgbevfversion [SUCCESS] Not using ixgbevf driver.
    module run/kernelbug [SUCCESS] No kernel bug occurrences found
    module run/kerneldereference [SUCCESS] No kernel null pointer dereference ocurrences found
    module run/kernelpanic [SUCCESS] No kernel panic occurrences found
    module run/oomkiller [SUCCESS] No oom-killer invocations found
    module run/softlockup [SUCCESS] No CPU soft lockup occurrences found
    module run/tcprecycle [SUCCESS] Aggressive TCP recycling is disabled.
    module run/xennetrocket [SUCCESS] No SKB overflow bug found
    module run/xennetsgmtu [SUCCESS] Scatter-Gather is enabled on enX0. This mitigates the bug.
    module run/enadiag [WARN] Unable to run ENA stats module.
    --------------[Run Stats]--------------
    Total modules run: 88
    'collect' modules run: 42
    'gather' modules run: 26
    'diagnose' modules run: 20
    successes: 18
    failures: 1
    warnings: 1
    unknown: 0
    Modules not run due to missing: sudo | software | parameters | perf-impact
    0 | 10 | 62 | 7
    ----------------[NOTICE]----------------
    Please note, this directory could contain sensitive data depending on modules run! Please review its contents!
    ----------------[Upload]----------------
    You can upload results to AWS Support with the following, or run 'help upload' for details on using an S3 presigned URL:
    sudo ./ec2rl upload --upload-directory=/var/tmp/ec2rl/2023-11-13T07_26_40.955988 --support-url="URLProvidedByAWSSupport"
    The quotation marks are required, and if you ran the tool with sudo, you will also need to upload with sudo.
    ---------------[Feedback]---------------
    We appreciate your feedback. If you have any to give, please visit:
    https://aws.au1.qualtrics.com/jfe1/form/SV_3KrcrMZ2quIDzjn?InstanceID=i-04569f2fa0c58236a&Version=1.1.6
    [root@ip-172-31-7-187 ec2rl-1.1.6]#
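
On a run with many modules, the few results that need attention are easy to miss among the successes. The following is a minimal sketch, assuming the console output was saved to a file; the function name `ec2rl_problems` and the file name `run.out` are hypothetical, and the `[UNKNOWN]` label is an assumption based on the "unknown" counter in the run stats.

```shell
#!/bin/sh
# Sketch: keep only diagnostic result lines whose status is not SUCCESS.
# Reads ec2rl console output on stdin. The function name is hypothetical.
ec2rl_problems() {
    # Match "module run/<name> [STATUS] ..." with optional leading whitespace.
    # "[UNKNOWN]" is an assumed label; FAILURE and WARN appear in the output above.
    grep -E '^[[:space:]]*module run/[^[:space:]]+ \[(FAILURE|WARN|UNKNOWN)\]' || true
}

# Example (assumes the output was captured with: ./ec2rl run | tee run.out):
#   ec2rl_problems < run.out
```

The filter prints only the summary line of each module; the detail lines that follow a failure remain in the saved output.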

  2. To fix the identified issues, run the tool with the --remediate option. Review the findings under /var/tmp/ec2rl before you remediate.

    [root@ip-172-31-7-187 ec2rl-1.1.6]# ./ec2rl run --remediate
    -----------[Backup Creation]-----------
    No backup option selected. Please consider backing up your volumes or instance
    ----------[Configuration File]----------
    Configuration file saved:
    /var/tmp/ec2rl/2023-11-13T07_38_14.941190/configuration.cfg
    -------------[Output Logs]-------------
    The output logs are located in:
    /var/tmp/ec2rl/2023-11-13T07_38_14.941190
    --------------[Module Run]--------------
    Running Modules:
    arptable, blkid, cgroups, clocksource, cpuinfo, ***************, sysctlconf, systemsmanager, udev, yumlog
    ----------[Diagnostic Results]----------
    module run/arpcache [SUCCESS] Aggressive arp caching is disabled.
    module run/arpignore [SUCCESS] arp ignore is disabled for all interfaces.
    module run/asymmetricroute [SUCCESS] No duplicate subnets found.
    module run/conntrackfull [SUCCESS] No conntrack table full errors found.
    module run/consoleoverload [SUCCESS] No serial console overload found
    module run/duplicatefslabels [SUCCESS] No duplicate filesystem labels found.
    module run/duplicatefsuuid [SUCCESS] No duplicate filesystem UUIDs found.
    module run/duplicatepartuuid [SUCCESS] No duplicate partition UUIDs found.
    module run/hungtasks [SUCCESS] No hung tasks found
    module run/ixgbevfversion [SUCCESS] Not using ixgbevf driver.
    module run/kernelbug [SUCCESS] No kernel bug occurrences found
    module run/kerneldereference [SUCCESS] No kernel null pointer dereference ocurrences found
    module run/kernelpanic [SUCCESS] No kernel panic occurrences found
    module run/oomkiller [SUCCESS] No oom-killer invocations found
    module run/openssh [SUCCESS] All configuration checks passed or all detected problems fixed.
    -- FIXED Permission mode includes permissions for groups and/or other users: /etc/ssh/ssh_host_ed25519_key
    -- FIXED Permission mode includes permissions for groups and/or other users: /etc/ssh/ssh_host_ecdsa_key
    module run/softlockup [SUCCESS] No CPU soft lockup occurrences found
    module run/tcprecycle [SUCCESS] Aggressive TCP recycling is disabled.
    module run/xennetrocket [SUCCESS] No SKB overflow bug found
    module run/xennetsgmtu [SUCCESS] Scatter-Gather is enabled on enX0. This mitigates the bug.
    module run/enadiag [WARN] Unable to run ENA stats module.

    --------------[Run Stats]--------------

    Total modules run: 88
    'collect' modules run: 42
    'gather' modules run: 26
    'diagnose' modules run: 20
    successes: 19
    failures: 0
    warnings: 1
    unknown: 0

    Modules not run due to missing: sudo | software | parameters | perf-impact
    0 | 10 | 62 | 7
    ----------------[NOTICE]----------------
    Please note, this directory could contain sensitive data depending on modules run! Please review its contents!
    ----------------[Upload]----------------
    You can upload results to AWS Support with the following, or run 'help upload' for details on using an S3 presigned URL:
    sudo ./ec2rl upload --upload-directory=/var/tmp/ec2rl/2023-11-13T07_38_14.941190 --support-url="URLProvidedByAWSSupport"
    The quotation marks are required, and if you ran the tool with sudo, you will also need to upload with sudo.
    ---------------[Feedback]---------------
    We appreciate your feedback. If you have any to give, please visit:
    https://aws.au1.qualtrics.com/jfe1/form/SV_3KrcrMZ2quIDzjn?InstanceID=i-04569f2fa0c58236a&Version=1.1.6
    [root@ip-172-31-7-187 ec2rl-1.1.6]#
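
Each run writes its results to a separate timestamped directory under /var/tmp/ec2rl, as the two outputs above show. When reviewing findings, it can help to locate the most recent one. This is a rough sketch, not part of the tool; the function name `latest_ec2rl_run` is hypothetical.

```shell
#!/bin/sh
# Sketch: print the most recent ec2rl results directory.
# The function name is hypothetical; the default path comes from the output above.
latest_ec2rl_run() {
    base=${1:-/var/tmp/ec2rl}
    # Directory names are timestamps (for example 2023-11-13T07_26_40.955988),
    # so a plain lexical sort is also a chronological sort.
    for d in "$base"/*/; do
        [ -d "$d" ] && printf '%s\n' "${d%/}"
    done | sort | tail -n 1
}

# Example: list what the latest run collected before deciding to remediate.
#   ls -l "$(latest_ec2rl_run)"
```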

How to use the EC2Rescue tool on impaired instances:

To troubleshoot an unreachable or unbootable Amazon EC2 instance, attach its root volume to a working instance and run EC2Rescue from there, as follows.

1. For this demo, let’s recover a failed EC2 instance. The instance is currently in the stopped state.

2. Let’s detach the root volume.

3. Let’s attach the volume to the instance that is currently accessible. Click on “Attach Volume” and select the instance to attach.
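
The detach and attach steps above can also be done from the AWS CLI instead of the console. The following is a sketch under stated assumptions: the helper names `run` and `move_root_volume` are hypothetical, the IDs in the example are placeholders, and the device name /dev/sdf is an assumption chosen because such a device typically shows up as xvdf on Xen-based instances. By default the sketch only prints the commands.

```shell
#!/bin/sh
# Sketch: AWS CLI equivalent of the console detach/attach steps.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: move a root volume to a rescue instance.
move_root_volume() {
    volume_id=$1    # root volume of the impaired (stopped) instance
    rescue_id=$2    # working instance used for the rescue
    run aws ec2 detach-volume --volume-id "$volume_id"
    # Wait until the volume is fully detached before attaching it elsewhere.
    run aws ec2 wait volume-available --volume-ids "$volume_id"
    # /dev/sdf is an assumed free device name on the rescue instance.
    run aws ec2 attach-volume --volume-id "$volume_id" \
        --instance-id "$rescue_id" --device /dev/sdf
}

# Example with placeholder IDs (prints three commands):
#   move_root_volume vol-0123456789abcdef0 i-0123456789abcdef0
```

Set DRY_RUN=0 only after checking that the printed commands reference the intended volume and instance.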

4. Once the volume is attached, you can see the new disk (xvdf in this example) on the Linux instance by running the following command.

[root@ip-172-31-7-187 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
xvda 202:0 0 8G 0 disk
├─xvda1 202:1 0 8G 0 part /
├─xvda127 259:0 0 1M 0 part
└─xvda128 259:1 0 10M 0 part /boot/efi
xvdf 202:80 0 8G 0 disk
├─xvdf1 202:81 0 8G 0 part
├─xvdf127 259:2 0 1M 0 part
└─xvdf128 259:3 0 10M 0 part
[root@ip-172-31-7-187 ~]#

5. Create a mount point for the root disk of the impaired instance.

[root@ip-172-31-7-187 ~]# mkdir /web1-rootdisk
[root@ip-172-31-7-187 ~]#

6. Let’s mount the volume using the following command.

[root@ip-172-31-7-187 ~]# mount -t xfs /dev/xvdf1 /web1-rootdisk
mount: /web1-rootdisk: wrong fs type, bad option, bad superblock on /dev/xvdf1, missing codepage or helper program, or other error.
[root@ip-172-31-7-187 ~]#

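[root@ip-172-31-7-187 ~]# dmesg | grep xvdf1
[34052.328964] xvdf: xvdf1 xvdf127 xvdf128
[34189.932781] XFS (xvdf1): Filesystem has duplicate UUID 3f35b7b7-3b0c-4802-85f2-d1e990bef1d5 - can't mount
[34237.411117] XFS (xvdf1): Filesystem has duplicate UUID 3f35b7b7-3b0c-4802-85f2-d1e990bef1d5 - can't mount

The kernel log shows that the mount failed because the XFS filesystem on /dev/xvdf1 has the same UUID as a filesystem that is already mounted, which is common when both volumes were created from the same image. To mount it anyway, use the nouuid option: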

[root@ip-172-31-7-187 ~]# mount -o nouuid /dev/xvdf1 /web1-rootdisk
[root@ip-172-31-7-187 ~]#
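
To confirm a UUID clash before mounting, the UUIDs of all block devices can be compared. This is a small sketch, assuming `lsblk` is available on the rescue instance; the function name `duplicate_uuids` is hypothetical.

```shell
#!/bin/sh
# Sketch: print every UUID that appears more than once on stdin.
# The function name is hypothetical.
duplicate_uuids() {
    # Drop empty lines (devices without a filesystem), then list repeated values.
    grep -v '^[[:space:]]*$' | sort | uniq -d
}

# Example: lsblk prints one UUID per device (-r raw, -n no header, -o column).
#   lsblk -rno UUID | duplicate_uuids
```

Any UUID printed here belongs to two or more filesystems, and an XFS filesystem in that set needs the nouuid option while the other copy is mounted.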

7. Bind-mount the /proc, /sys, /dev, and /run pseudo-filesystems of the running instance into the mounted volume so that tools work correctly inside the chroot.

[root@ip-172-31-7-187 ~]# for i in proc sys dev run; do mount --bind /$i /web1-rootdisk/$i ; done
[root@ip-172-31-7-187 ~]#

8. Change root (chroot) into the mounted volume using the following command.

[root@ip-172-31-7-187 ~]# chroot /web1-rootdisk
[root@ip-172-31-7-187 /]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/xvdf1 8.0G 1.5G 6.5G 19% /
devtmpfs 4.0M 0 4.0M 0% /dev
tmpfs 190M 2.9M 188M 2% /run
[root@ip-172-31-7-187 /]#

9. Follow the steps in “How to install the EC2Rescue tool” so that “ec2rl” is available inside the chroot, then run the tool to identify the issues on the instance.

[root@ip-172-31-7-187 ec2rl-1.1.6]# ./ec2rl run
-----------[Backup Creation]-----------
No backup option selected. Please consider backing up your volumes or instance
----------[Configuration File]----------
Configuration file saved:
/var/tmp/ec2rl/2023-11-13T17_02_33.077167/configuration.cfg
-------------[Output Logs]-------------
The output logs are located in:
/var/tmp/ec2rl/2023-11-13T17_02_33.077167
--------------[Module Run]--------------
Running Modules:
arptable, blkid, cgroups, clocksource, cpuinfo, date,****************sysctlconf, systemsmanager, udev, yumlog
----------[Diagnostic Results]----------
module run/duplicatefslabels [FAILURE] Duplicate label, /, found on the following filesystems: /dev/xvda1, /dev/xvdf1
module run/duplicatefsuuid [FAILURE] Duplicate UUID, 452A-C785, found on the following filesystems: /dev/xvda128, /dev/xvdf128
module run/duplicatepartuuid [FAILURE] Duplicate UUID, cb4117a3-2893-4ee1-ac2e-a7e9b717d368, found on the following partitions: /dev/xvda128, /dev/xvdf128
module run/openssh [FAILURE] Improper configuration of one or more OpenSSH components.
-- SSH may deny access to users when improperly configured.
-- FAILURE Permission mode includes permissions for groups and/or other users: /etc/ssh/ssh_host_ed25519_key
-- Adjust permissions: sudo chmod 600 /etc/ssh/ssh_host_ed25519_key
-- FAILURE Permission mode includes permissions for groups and/or other users: /etc/ssh/ssh_host_ecdsa_key
-- Adjust permissions: sudo chmod 600 /etc/ssh/ssh_host_ecdsa_key
module run/arpcache [SUCCESS] Aggressive arp caching is disabled.
module run/arpignore [SUCCESS] arp ignore is disabled for all interfaces.
module run/asymmetricroute [SUCCESS] No duplicate subnets found.
module run/conntrackfull [SUCCESS] No conntrack table full errors found.
module run/consoleoverload [SUCCESS] No serial console overload found
module run/hungtasks [SUCCESS] No hung tasks found
module run/ixgbevfversion [SUCCESS] Not using ixgbevf driver.
module run/kernelbug [SUCCESS] No kernel bug occurrences found
module run/kerneldereference [SUCCESS] No kernel null pointer dereference ocurrences found
module run/kernelpanic [SUCCESS] No kernel panic occurrences found
module run/oomkiller [SUCCESS] No oom-killer invocations found
module run/softlockup [SUCCESS] No CPU soft lockup occurrences found
module run/tcprecycle [SUCCESS] Aggressive TCP recycling is disabled.
module run/xennetrocket [SUCCESS] No SKB overflow bug found
module run/xennetsgmtu [SUCCESS] Scatter-Gather is enabled on enX0. This mitigates the bug.
module run/enadiag [WARN] Unable to run ENA stats module.
--------------[Run Stats]--------------

Total modules run: 88
'collect' modules run: 42
'gather' modules run: 26
'diagnose' modules run: 20
successes: 15
failures: 4
warnings: 1
unknown: 0
Modules not run due to missing: sudo | software | parameters | perf-impact
0 | 10 | 62 | 7
----------------[NOTICE]----------------
Please note, this directory could contain sensitive data depending on modules run! Please review its contents!
----------------[Upload]----------------
You can upload results to AWS Support with the following, or run 'help upload' for details on using an S3 presigned URL:
sudo ./ec2rl upload --upload-directory=/var/tmp/ec2rl/2023-11-13T17_02_33.077167 --support-url="URLProvidedByAWSSupport"
The quotation marks are required, and if you ran the tool with sudo, you will also need to upload with sudo.
---------------[Feedback]---------------
We appreciate your feedback. If you have any to give, please visit:
https://aws.au1.qualtrics.com/jfe1/form/SV_3KrcrMZ2quIDzjn?InstanceID=i-04569f2fa0c58236a&Version=1.1.6
[root@ip-172-31-7-187 ec2rl-1.1.6]#

10. Fix the issues by running the tool with the --remediate option.

[root@ip-172-31-7-187 ec2rl-1.1.6]# ./ec2rl run --remediate
-----------[Backup Creation]-----------
No backup option selected. Please consider backing up your volumes or instance
----------[Configuration File]----------
Configuration file saved:
/var/tmp/ec2rl/2023-11-13T17_05_54.231545/configuration.cfg
-------------[Output Logs]-------------
The output logs are located in:
/var/tmp/ec2rl/2023-11-13T17_05_54.231545
--------------[Module Run]--------------

11. Exit from the chroot and unmount the filesystems.

[root@ip-172-31-7-187 ec2rl-1.1.6]# exit
exit
[root@ip-172-31-7-187 ~]# umount /web1-rootdisk/{proc,sys,dev,run,}
[root@ip-172-31-7-187 ~]#
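
Before detaching the volume, it is worth confirming that nothing is still mounted at or below the mount point. The following is a minimal sketch, assuming a Linux host where /proc/mounts lists active mounts; the function name `mounts_under` is hypothetical.

```shell
#!/bin/sh
# Sketch: list mount points at or below a given directory.
# The function name is hypothetical; the second argument lets you point the
# check at a different mounts file (defaults to /proc/mounts).
mounts_under() {
    mount_point=${1%/}
    mounts_file=${2:-/proc/mounts}
    # Field 2 of each /proc/mounts line is the mount point.
    awk -v p="$mount_point" \
        '$2 == p || index($2, p "/") == 1 { print $2 }' "$mounts_file"
}

# Example: no output means the volume is safe to detach.
#   mounts_under /web1-rootdisk
```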

12. Detach the volume from the rescue instance in the AWS console and attach it back to the impaired instance with the device name “/dev/xvda”. Then start the instance and try to access it.

We have successfully recovered the failed EC2 instance using the EC2Rescue tool.

Conclusion

By using EC2Rescue, users can streamline troubleshooting and quickly identify and resolve issues affecting the performance or functionality of their EC2 instances. It is particularly valuable for AWS users who do not have extensive experience diagnosing issues in AWS environments.
