EIR-OPS-005: Failsafe at Initial AOS


Objective

To confirm 2-way communication with EIRSAT-1 while in failsafe, to assess why failsafe was booted, and subsequently to leave failsafe.


Introduction

The Operator should have been directed to this procedure from EIR-OPS-004: Initial AOS as the current boot image of the spacecraft at initial AOS was determined to be failsafe.

Using this procedure, the Operator will verify that the antennas are fully deployed prior to making the decision to finish the Separation Sequence. This part of the procedure is very similar to EIR-OPS-004: Initial AOS but has been tailored to suit the current boot image. The Operator will also be advised on how to assess what chain of events potentially led to failsafe since launch, and whether a primary image can/should be safely booted.

Note

Failsafe is not equipped with a Mode Manager. Therefore, as part of this procedure, rather than transitioning from Separation Sequence Mode to Commissioning Mode, the Operator will just ‘finish the Separation Sequence’, putting the Separation Sequence state machine into its finished state directly rather than via a Mode Manager.


Procedure

This procedure follows on from Section B of EIR-OPS-004: Initial AOS and contains the following sub-procedures:

Important

Communication with the spacecraft is required for Sections A, C and D of this procedure.


B. Data Analysis (After the Communication Pass)

Note

The analysis to be carried out by the team is very dependent on the findings as well as what data was successfully downlinked in Section A. Therefore, rather than a strict set of instructions, this section instead provides information to help guide the Operator in their analyses. Also note that in addition to any data downlinked by the UCD GS, data obtained via the amateur radio community may also be used to support the analysis/findings.

SPACECRAFT HEALTH CHECK

B.1.

  • Any ‘NEW rows of FAILSAFE HK data’ downlinked should now be checked to assess the current state of the spacecraft and its subsystems. Other than the fact that failsafe is the current boot image, do any of the HK parameters give cause for concern? e.g.:

    • Are the battery bus voltage levels nominal?

    • Are the various EPS and/or battery reset counters as expected given their pre-launch values?

    • Has the temperature of the CMC Power Amplifier stayed within expected/acceptable limits since RF transmissions were enabled?

Tip

This information should be used to assist with the ‘FAILSAFE BOOTED ANALYSIS’ below.

Tip

In addition to the most recent value of each parameter, check how the values changed with time. Use Grafana to help with this.


B.2.

  • The Operator should also assess whether the failsafe image has been stable since booted. To do this:

    • If the full failsafe Event log has been downlinked, search it for occurrences of the Separation Sequence ‘StateFunctionComplete’ event with event data = 0x00 (i.e. the Separation Sequence Init State). If failsafe has been stable since booted, only one event with data = 0x00 should be observed.

    • If the full Event log has NOT YET been downlinked but some ‘NEW rows of FAILSAFE HK data’ and some ‘NEW rows of PRIMARY HK data’ were retrieved:

      • Use the most recent On-Board Time (OBT) and uptime parameter values in the ‘NEW rows of FAILSAFE HK data’ to determine the OBT of the last reboot.

      • If this OBT is roughly consistent with the last OBT parameter value in the ‘NEW rows of PRIMARY HK data’, then failsafe has likely been stable since booted.

  • If multiple reboots have occurred since failsafe has been booted, the Operator should investigate this in parallel to the below analysis, which is more focused on the nature of the reboots that led to failsafe as opposed to reboots while operating in failsafe. However, the same analysis largely applies and should be considered prior to proceeding to Section D.
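
The two stability checks in B.2 can be sketched as below. This is a minimal illustration, not part of the ground software: the event and HK record layout, the function names, and the 120-second tolerance are all assumptions, and OBT and uptime are assumed to be expressed in the same units (seconds).

```python
from typing import Iterable, Mapping

INIT_STATE = 0x00  # Separation Sequence Init State, as quoted in B.2


def count_init_events(events: Iterable[Mapping]) -> int:
    """Count 'StateFunctionComplete' events whose event data is 0x00.

    Assumes each event is a mapping with hypothetical keys 'name' and 'data'.
    Exactly one such event suggests failsafe has been stable since booted.
    """
    return sum(
        1
        for e in events
        if e["name"] == "StateFunctionComplete" and e["data"] == INIT_STATE
    )


def last_reboot_obt(obt_s: int, uptime_s: int) -> int:
    """OBT of the last reboot = most recent OBT minus most recent uptime."""
    return obt_s - uptime_s


def consistent_with_primary(
    reboot_obt_s: int, last_primary_obt_s: int, tolerance_s: int = 120
) -> bool:
    """True if the reboot OBT roughly matches the last OBT seen in primary HK.

    tolerance_s is an illustrative value; the procedure only says 'roughly'.
    """
    return abs(reboot_obt_s - last_primary_obt_s) <= tolerance_s
```

For example, a failsafe HK row with OBT 5000 s and uptime 1200 s places the last reboot at OBT 3800 s. If the last primary HK row carries an OBT close to that value, failsafe has likely been stable since it was booted.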


2-WAY COMMUNICATION CONFIRMATION

B.3.

  • The downlinked data should now be assessed to confirm, with confidence, that:

    1. full antenna deployment has occurred, and

    2. nominal 2-way communication has been achieved.

  • To do this, the following should be considered:

    • Does the downlinked Event log (i.e. the ‘OLD rows of FAILSAFE Event data’) suggest that the Separation Sequence successfully progressed to and through the different burn and between-burn-wait states (i.e. are Separation Sequence ‘StateFunctionComplete’ events observed with Event Data = the IDs of the burn and between-burn-wait states)?

      Tip

      To aid this assessment, the Operator can compare their data with the Event log data downlinked during the MMTs.

    • In the ‘OLD rows of ADM data’:

      • Do the ADM switch states, read by both the OBC and the EMOD MSP (i.e. mission.SeparationSequence.AntSwitchesStatuses and platform.ADM.SwitchesStatuses), indicate that the antenna elements have been deployed?

      • Do the deployment times of the different elements coincide with the resistor burns?

      • Do the PDM currents show that the correct current went through the resistors for the correct amount of time during the resistor burns?

    • In the downlinked HK data:

      • Check that the temperature of the CMC Power Amplifier increased only after RF transmissions were enabled, to confirm that RF transmissions were enabled when expected.
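
As an illustration of the "do the deployment times coincide with the resistor burns?" check, the sketch below flags any deployment time that falls outside every burn window. The data layout, the function name, and the 5-second slack are assumptions made for this example; the real ADM and Event data would first need to be converted into OBT values.

```python
from typing import Iterable, List, Tuple

Window = Tuple[int, int]  # (burn start OBT, burn end OBT), in seconds


def deployments_outside_burns(
    deploy_times: Iterable[int], burns: Iterable[Window], slack_s: int = 5
) -> List[int]:
    """Return the deployment times that do not fall within any burn window.

    slack_s allows a switch to register shortly after a burn ends; it is an
    illustrative value, not one taken from the procedure.
    """
    burn_list = list(burns)
    return [
        t
        for t in deploy_times
        if not any(start <= t <= end + slack_s for start, end in burn_list)
    ]
```

An empty result means every element deployed during, or just after, a burn. Any time returned deserves a closer look before concluding that deployment was nominal.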


B.4.

  • When the team are satisfied that all antenna elements are fully deployed and that 2-way communications are stable, during the next communication pass, Section C should be carried out.


FAILSAFE BOOTED ANALYSIS

B.6.

  • The Operator should first assess the timeline of the reboot(s) that led to failsafe. To do this, take note of the most recent core.OBT.uptime in the ‘NEW rows of PRIMARY HK data’, and consider the following possibilities:

    • If there are no rows of ‘PRIMARY HK data’ available for downlink, a failed attempt to boot the primary image at start-up likely led to failsafe being booted. This theory is supported if there are also no rows of ‘PRIMARY Event data’.

      If this is the case, the Operator should now consider what might have prevented a successful boot. The remaining steps in this section are largely not applicable to this analysis, so the Software Engineer should be contacted for support.

    • If this core.OBT.uptime is >2 hours, failsafe was booted as a result of:

      • A reboot + a failed attempt to boot back into the previously operating primary image, or

      • A reboot where the primary image was not marked as stable even though >2 hours of operating in the image had passed.

      Both scenarios require an assessment of the initial reboot. However, both also imply some anomalous or unexpected (possibly software) behaviour. Therefore, if either scenario has occurred, the Software Engineer should be contacted for support.

    • If this core.OBT.uptime is <2 hours AND >2 hours had elapsed since on-orbit deployment, failsafe was booted as a result of more than one reboot sometime after launch, where the first rebooted the primary image.

      In this case, ‘NEW rows of PRIMARY HK data’ and ‘NEW rows of PRIMARY Event data’ should be searched for further evidence of the first reboot into the primary image (e.g. did uptime reset? Are there multiple occurrences of the Separation Sequence StateFunctionComplete event with event data = 0x00?).

    • If this core.OBT.uptime is <2 hours AND <2 hours had elapsed since on-orbit deployment, failsafe was booted as a result of a single reboot sometime after launch.
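
The possibilities listed in B.6 form a small decision tree, sketched below. The function and argument names are hypothetical, both times are assumed to be in seconds, and the procedure does not say how to treat a value of exactly 2 hours, so that case is grouped with "<2 hours" here as an assumption.

```python
from typing import Optional

TWO_HOURS_S = 2 * 60 * 60  # the 2-hour threshold quoted in B.6


def classify_timeline(
    primary_uptime_s: Optional[int], since_deployment_s: int
) -> str:
    """Map the latest primary uptime onto the B.6 scenarios.

    primary_uptime_s is None when no rows of PRIMARY HK data are available.
    """
    if primary_uptime_s is None:
        return "no primary HK: primary image likely failed to boot at start-up"
    if primary_uptime_s > TWO_HOURS_S:
        return (
            "uptime >2 h: failed boot back into primary, or image not marked "
            "stable; contact the Software Engineer"
        )
    # Exactly 2 h is not defined by the procedure; treated as "<2 h" here.
    if since_deployment_s > TWO_HOURS_S:
        return "more than one reboot; the first rebooted the primary image"
    return "single reboot sometime after launch"
```

The returned strings are only labels for the Operator's notes; the follow-up actions remain those described in the corresponding bullets above.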


B.7.

  • To determine the nature of any reboots identified in the previous step, the Operator should now search the Event logs (i.e. the ‘OLD rows of FAILSAFE Event data’ and ‘NEW rows of PRIMARY Event data’) for ‘EPSInitialised’ events around the times of the reboots.

  • If this event is observed, a full spacecraft power-cycle led to the reboot.

  • Else, an OBC reset occurred.
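
A minimal sketch of the B.7 classification follows. It assumes each Event log entry has been parsed into a mapping with hypothetical keys 'name' and 'obt' (seconds), and that "around the time of the reboot" means within 60 seconds; neither the layout nor the window comes from the procedure.

```python
from typing import Iterable, Mapping


def classify_reboot(
    events: Iterable[Mapping], reboot_obt_s: int, window_s: int = 60
) -> str:
    """Return 'power-cycle' if an EPSInitialised event lies near the reboot.

    Otherwise the reboot is attributed to an OBC reset, as stated in B.7.
    """
    for e in events:
        if e["name"] == "EPSInitialised" and abs(e["obt"] - reboot_obt_s) <= window_s:
            return "power-cycle"
    return "OBC reset"
```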


B.8.

  • If a full spacecraft power-cycle occurred, the Operator should now assess the ‘NEW rows of PRIMARY HK data’ and ‘NEW rows of PRIMARY Event data’ to determine if there is evidence that low battery conditions caused the reboot(s). In particular, the Operator should:

    • Search the HK data for a decrease in the battery bus voltage to ~6.144 V, and

    • Search the Event log for the ‘LowVoltageExceptionBATSafe’ event.

    If evidence that low battery conditions caused the reboot(s) is found, the Operator should now consider using the EIR-OPS-026: Low Battery Fault Analysis procedure to assist further analysis.

  • If a full spacecraft power-cycle did not occur OR if a power-cycle did occur but there is no evidence of low battery issues, the Operator should now consider using the EIR-OPS-027: Reboot Fault Analysis procedure to assist further analysis.
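
The choice of follow-up procedure in B.8 can be sketched as below. The 6.144 V level and the event name come from the step above; the 0.05 V margin used to interpret "~", the function names, and the input layout are assumptions for illustration. Treating either piece of evidence as sufficient is also an assumption, since the procedure does not say whether both are required.

```python
from typing import Iterable, Tuple

LOW_BATTERY_V = 6.144  # battery bus voltage level quoted in B.8


def low_battery_evidence(
    hk_voltages: Iterable[float], event_names: Iterable[str], margin_v: float = 0.05
) -> Tuple[bool, bool]:
    """Return (voltage_dip_seen, low_voltage_event_seen)."""
    voltage_dip = any(v <= LOW_BATTERY_V + margin_v for v in hk_voltages)
    exception_seen = "LowVoltageExceptionBATSafe" in set(event_names)
    return voltage_dip, exception_seen


def next_procedure(power_cycle: bool, voltage_dip: bool, exception_seen: bool) -> str:
    """Pick the follow-up procedure named in B.8."""
    if power_cycle and (voltage_dip or exception_seen):
        return "EIR-OPS-026: Low Battery Fault Analysis"
    return "EIR-OPS-027: Reboot Fault Analysis"
```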


B.9.

  • When the team have completed their analysis and wish to leave the failsafe image, Section D should be carried out.



C. Finishing the Separation Sequence


C.1.

  • Invoke the mission.SeparationSequence.SeparationSequenceFinish action.

TC Details
  MCS Operation:          Invoke
  Action/Param Name:      mission.SeparationSequence.SeparationSequenceFinish
  Data Expected with TC:  No

TM Details
  Data Expected from TC:  No ( + ACK )


C.2.

  • Get the mission.SeparationSequence.state parameter.

  • Ensure that the returned state is 0x42 (hex) / 66 (dec).

TC Details
  MCS Operation:          Get
  Action/Param Name:      mission.SeparationSequence.state
  Data Expected with TC:  No

TM Details
  Data Expected from TC:  state ( + ACK )
  Data Size:              1 byte
  Data Info:              the current state of the Separation Sequence
  Allowed Value(s):       00 - 09 or 42 (hex)
  Expected Value(s):      42 (hex) / 66 (dec)
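
The check in C.2 amounts to validating a single returned byte. The sketch below assumes the telemetry payload is available as a Python bytes object; the function name is hypothetical, and how the MCS actually exposes the value is not covered by this procedure.

```python
FINISHED_STATE = 0x42  # 66 in decimal
ALLOWED_STATES = set(range(0x00, 0x0A)) | {FINISHED_STATE}  # 00-09 or 42 (hex)


def separation_sequence_finished(raw: bytes) -> bool:
    """Return True if the 1-byte state parameter reports the finished state.

    Raises ValueError if the payload is not 1 byte or the value is not allowed.
    """
    if len(raw) != 1:
        raise ValueError(f"expected 1 byte, got {len(raw)}")
    state = raw[0]
    if state not in ALLOWED_STATES:
        raise ValueError(f"state 0x{state:02X} is outside the allowed values")
    return state == FINISHED_STATE
```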


C.3.

  • On exit of the Separation Sequence, all PDMs should be powered OFF. To confirm this, Get the platform.EPS.actualSwitchStates parameter with First row = 0 and Last row = 9.

  • Ensure that every row returns 0, except row 7/PDM 8.

Caution

Because the FSS draws parasitic power, row 7/PDM 8 of EPS.actualSwitchStates is always returned as 1 (ON), even when row 7/PDM 8 of EPS.expectedSwitchStates is set to 0 (OFF).

TC Details
  MCS Operation:          Get
  Action/Param Name:      platform.EPS.actualSwitchStates
  Data Expected with TC:  Yes
  Data Size:              2 bytes, 2 bytes
  Data Info:              First row, Last row
  Allowed Value(s):       0-9, 0-9
  Expected Value(s):      0, 9

TM Details
  Data Expected from TC:  List of switch states ( + ACK )
  Data Size:              List[0:10] of Booleans
  Data Info:              If all 0, all PDMs are off
  Allowed Value(s):       0000000000 (all PDMs OFF) - 1111111111 (all PDMs ON)
  Expected Value(s):      0000000100 (all PDMs OFF, except for the FSS PDM/PDM 8)
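
The C.3 check, which ignores row 7/PDM 8, can be sketched as follows. The sketch assumes the ten switch states are available as a string of '0' and '1' characters in row order, matching the notation used for the expected value; the function name is hypothetical.

```python
FSS_ROW = 7  # row 7 / PDM 8: always reads 1 because of the FSS parasitic draw


def all_pdms_off(states: str) -> bool:
    """Return True if every row except the FSS row reads 0 (OFF)."""
    if len(states) != 10 or set(states) - {"0", "1"}:
        raise ValueError("expected ten characters, each '0' or '1'")
    return all(bit == "0" for row, bit in enumerate(states) if row != FSS_ROW)
```

With this check, both "0000000100" and "0000000000" pass, while any other row reading 1 fails.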



D. Booting Primary

Warning

This section of the procedure should ONLY be carried out following the close-out of Sections B and C, and ONLY IF the decision has been made to proceed with booting into a primary image.

D.1.

  • The Operator should now follow the EIR-OPS-024: Boot Into OBC Image procedure to boot the primary image of choice (i.e. primary1 or primary2).

  • If the primary image is successfully booted and is stable (i.e. no reboots to failsafe), the Operator can begin the EIR-OPS-006: Commissioning Procedure.


END OF PROCEDURE