EIR-OPS-032: Separation Sequence Restarted
Objective
To confirm why the mission’s Separation Sequence has restarted after previously being exited/finished in the current boot image, and to regain nominal operations.
Introduction
Using this procedure, the Operator will downlink data from the spacecraft to determine the most likely cause of this unexpected first-boot state. Identifying the most likely root cause will allow the Operator to properly assess what further analysis/health checks should then be performed prior to resuming nominal operations.
Note
As this is a failure/unexpected scenario, this procedure is a guide of the parameters and questions to consider rather than a strict set of instructions. The procedure and data downlinked should be tailored to suit the issues identified.
Procedure
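This procedure contains the following sub-procedures:
A.I. Exiting the Separation Sequence (Primary image booted)
A.II. Exiting the Separation Sequence (Failsafe image booted)
B. Confirmation of Repeated First-Boot
C. Resetting absRowsLogged
D. Downlinking Data
E. Debugging the Issue
F. Resuming Nominal Operations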
Important
Communication with the spacecraft is required for Sections A - D and F. Once Sections A and B are completed, Sections C and D should be referred to for each communication pass. Data downlink should be the priority of the Operator during each pass while trying to establish the cause of the anomaly. The analysis detailed in Section E of this procedure should be performed outside of the communication pass.
A.I. Exiting the Separation Sequence (Primary image booted)
Tip
Follow this Section of the procedure if currBootImage = 1 or 2. Otherwise, proceed to Section A.II now.
A.I.1.
Exit Separation Sequence Mode and enter Commissioning Mode using EIR-OPS-007: Operational Mode Change.
A.I.2.
Invoke the platform.EPS.turnOffEMOD action to turn off EMOD/PDM 10.

| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | No |
| TM Details | |
| Data Expected from TC | No ( + ACK ) |
A.I.3.
To ensure all PDMs are now off, Get the parameter platform.EPS.actualSwitchStates with First Row = 0 and Last Row = 9.
Confirm that 0 is returned for all PDMs/rows (excluding PDM 8/row 7).
Warning
PDM 8 is drawing parasitic power. Therefore, when you Get the platform.EPS.actualSwitchStates parameter it will always read as ON/1 even when it is powered OFF.
| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | Yes |
| Data Size | 4 bytes, 4 bytes |
| Data Info | |
| Allowed Value(s) | 0-9, 0-9 |
| Expected Value(s) | 0, 9 |
| TM Details | |
| Data Expected from TC | |
| Data Size | 10 bits/list of 10 booleans |
| Data Info | List of PDM switch states |
| Allowed Value(s) | 0 (PDM off) or 1 (PDM on) |
| Expected Value(s) | 0 for all PDMs except PDM 8/row 7 |
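The check in this step can be sketched in Python as below. This is only an illustrative helper, assuming the returned TM has already been parsed into a list of ten 0/1 values indexed by row; the function name and constant are hypothetical and not part of the MCS.

```python
# Row 7 corresponds to PDM 8, which always reads 1 because of parasitic power.
PARASITIC_ROWS = frozenset({7})


def unexpected_on_rows(switch_states, ignore_rows=PARASITIC_ROWS):
    """Return the row indices that read ON (1) and are not in ignore_rows.

    switch_states: list of ten 0/1 values for rows 0-9 (assumed already
    parsed from the actualSwitchStates TM).
    """
    if len(switch_states) != 10:
        raise ValueError("expected 10 switch states (rows 0-9)")
    return [
        row
        for row, state in enumerate(switch_states)
        if state and row not in ignore_rows
    ]
```

An empty list means every PDM other than PDM 8 reads OFF; any returned row index should be investigated before continuing.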
A.I.4.
Get the platform.ADCS.adcsModeState parameter.
Confirm that the ADCS is operating in Standby Mode/Nadir State (i.e. that 0x00000000 is returned).

| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | No |
| TM Details | |
| Data Expected from TC | |
| Data Size | 4 bytes |
| Data Info | The current mode (2 MSB) and state (2 LSB) of the ADCS |
| Allowed Value(s) | See tables below |
| Expected Value(s) | 00000000 (hex) |
Where…

| Value (hex) | ADCS Mode |
|---|---|
| 0000 | Standby (Default) |
| 0001 | Detumble |
| 0002 | Spin Stabilised |
| 5550 | Test |

| Value (hex) | ADCS State |
|---|---|
| 0000 | Nadir (Default) |
| AAA8 | Test |
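As an illustration of how the 4-byte value splits into mode and state, the following Python sketch decodes it using the tables above. It assumes the TM has already been converted to an unsigned 32-bit integer; the function and dictionary names are hypothetical.

```python
# Values taken from the ADCS Mode and ADCS State tables above.
ADCS_MODES = {
    0x0000: "Standby (Default)",
    0x0001: "Detumble",
    0x0002: "Spin Stabilised",
    0x5550: "Test",
}
ADCS_STATES = {
    0x0000: "Nadir (Default)",
    0xAAA8: "Test",
}


def decode_adcs_mode_state(raw):
    """Split a 32-bit adcsModeState value into (mode name, state name).

    The mode is held in the 2 most significant bytes and the state in the
    2 least significant bytes. Unlisted values decode to "Unknown".
    """
    if not 0 <= raw <= 0xFFFFFFFF:
        raise ValueError("adcsModeState must fit in 4 bytes")
    mode = (raw >> 16) & 0xFFFF
    state = raw & 0xFFFF
    return ADCS_MODES.get(mode, "Unknown"), ADCS_STATES.get(state, "Unknown")
```

For the expected value, `decode_adcs_mode_state(0x00000000)` gives `("Standby (Default)", "Nadir (Default)")`.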
A.II. Exiting the Separation Sequence (Failsafe image booted)
Tip
If Section A.I was previously followed, skip ahead to Section B now. Otherwise, proceed with Step A.II.1. If currBootImage being 0/failsafe is unexpected, note that this procedure is only intended to help the Operators assess the cause of the first-boot scenario, and does not address why failsafe might be the current boot image. To assess the latter, the Operator should refer to EIR-OPS-029: Failsafe Entered after, or in parallel with, completing this procedure.
A.II.1.
Invoke the mission.SeparationSequence.SeparationSequenceFinish action.

| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | No |
| TM Details | |
| Data Expected from TC | No ( + ACK ) |
A.II.2.
Get the mission.SeparationSequence.state parameter.
Confirm that the Separation Sequence is in its finished state (i.e. 0x42).

| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | No |
| TM Details | |
| Data Expected from TC | |
| Data Size | 1 byte |
| Data Info | The current state of the Separation Sequence |
| Allowed Value(s) | 00 - 09 or 42 (hex) |
| Expected Value(s) | 42 (hex) |
A.II.3.
To ensure all PDMs are now off, Get the parameter platform.EPS.actualSwitchStates with First Row = 0 and Last Row = 9.
Confirm that 0 is returned for all PDMs/rows (excluding PDM 8/row 7).
Warning
PDM 8 is drawing parasitic power. Therefore, when you Get the platform.EPS.actualSwitchStates parameter it will always read as ON/1 even when it is powered OFF.
| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | Yes |
| Data Size | 4 bytes, 4 bytes |
| Data Info | |
| Allowed Value(s) | 0-9, 0-9 |
| Expected Value(s) | 0, 9 |
| TM Details | |
| Data Expected from TC | |
| Data Size | 10 bits/list of 10 booleans |
| Data Info | List of PDM switch states |
| Allowed Value(s) | 0 (PDM off) or 1 (PDM on) |
| Expected Value(s) | 0 for all PDMs except PDM 8/row 7 |
A.II.4.
Get the platform.ADCS.adcsModeState parameter.
Confirm that the ADCS is operating in Standby Mode/Nadir State (i.e. that 0x00000000 is returned).

| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | No |
| TM Details | |
| Data Expected from TC | |
| Data Size | 4 bytes |
| Data Info | The current mode (2 MSB) and state (2 LSB) of the ADCS |
| Allowed Value(s) | See tables below |
| Expected Value(s) | 00000000 (hex) |
Where…

| Value (hex) | ADCS Mode |
|---|---|
| 0000 | Standby (Default) |
| 0001 | Detumble |
| 0002 | Spin Stabilised |
| 5550 | Test |

| Value (hex) | ADCS State |
|---|---|
| 0000 | Nadir (Default) |
| AAA8 | Test |
B. Confirmation of Repeated First-Boot
B.1.
To establish whether this is in fact a repeated first-boot scenario, or instead a software bug/issue that has erroneously put the Separation Sequence/Mode Manager back into its first-boot state, Get any of the cdh.logging.XXXXLogger.absRowsLogged parameters from an on-board logger (e.g. ADCS, HK, TED, etc.) that previously had absRowsLogged > 0.

| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | No |
| TM Details | |
| Data Expected from TC | |
| Data Size | List[0:2] of Integer:32 |
| Data Info | The absolute number of rows ever logged to this channel |
| Allowed Value(s) | 00000000 - FFFFFFFF (hex) |
B.2.
If the absRowsLogged parameter for that logger is at least as large as the value last noted by the GS, a full first-boot scenario is NOT occurring and a software bug should instead be investigated.
Else, if the absRowsLogged parameter appears to have reset to 0 since last contact with the GS, this indicates that the image’s persisted configuration data has been erased and that a full first-boot scenario is likely.
Warning
If a first-boot scenario has been confirmed, TC authentication is now disabled. The Operator should consider following the EIR-OPS-009: Enable TC Authentication procedure ASAP to re-enable TC authentication to prevent replay attacks.
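The decision in Step B.2 can be sketched as a small Python function. This is an illustration only: it assumes a single integer absRowsLogged value is compared against the last value noted by the GS, and it treats any value below the last noted one as an apparent reset (the logger may have logged a few rows since the wipe). The function name and return labels are hypothetical.

```python
def classify_restart(last_noted_abs_rows, current_abs_rows):
    """Classify the anomaly from one logger's absRowsLogged values.

    Returns "software-bug" when the counter is at least as large as the value
    last noted by the GS, otherwise "first-boot-likely".
    """
    if last_noted_abs_rows <= 0:
        # Step B.1 requires a logger that previously had absRowsLogged > 0.
        raise ValueError("choose a logger that previously had absRowsLogged > 0")
    if current_abs_rows >= last_noted_abs_rows:
        return "software-bug"
    # Counter is lower than before: persisted configuration appears erased.
    return "first-boot-likely"
```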
C. Resetting absRowsLogged
Important
Repeat the steps in this section for all loggers found at the cdh.logging path in the SCDB, except the GMODTTEBufferLogger . The Operators can jump between this section and Section D, after the absRowsLogged parameter for each logger/channel has been updated.
C.1.
Get the cdh.logging.XXXLogger.channelId parameter.
Take note of the TM returned (i.e. as the ‘ChID’) for use in later steps.

| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | No |
| TM Details | |
| Data Expected from TC | |
| Data Size | 2 bytes |
| Data Info | The ID of the channel to which this logger will log data |
| Allowed Value(s) | 1 - 87 (dec) |
C.2.
Set the cdh.logging.XXXLogger.enabled parameter to 0 to temporarily disable logging of this channel’s data type.
Warning
Ideally, the Operators should aim to complete the steps of this section for an individual logger within a single pass so that logging is only briefly disabled.
| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | Yes |
| Data Size | 1 byte |
| Data Info | |
| Allowed Value(s) | 0 (disabled) - 1 (enabled) |
| Expected Value(s) | 0 |
| TM Details | |
| Data Expected from TC | No ( + ACK ) |
C.3.
Confirm the Set in the previous step with a Get (i.e. confirm the value was set successfully).
C.4.
Get the core.storage.isFull parameter, with First row = Last row = ChID (from Step C.1).
Take note of whether the channel is full (1) or not (0).

| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | Yes |
| Data Size | 4 bytes, 4 bytes |
| Data Info | |
| Allowed Value(s) | 1 - 87, 1 - 87 |
| Expected Value(s) | ChID (from Step C.1) |
| TM Details | |
| Data Expected from TC | |
| Data Size | 1 byte |
| Data Info | Whether the channel is full (1) or not (0) |
| Allowed Value(s) | 0 - 1 |
C.5.
Query the core.storage.channelContent parameter with Parameter index in block = ChID (from Step C.1).
If isFull (from Step C.4) = 0: The new absRowsLogged parameter value to be set in a later step equals the number of rows in ChID (returned from this query). With this information in hand, the Operator should now skip ahead to Step C.7.
Else if isFull = 1: The Operator should promptly downlink the last row of data from ChID and proceed to the next step.

| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | Yes |
| Data Size | 4 bytes |
| Data Info | |
| Allowed Value(s) | 00000001 - FFFFFFFF (hex) |
| Expected Value(s) | ChID (from Step C.1, dec) |
| TM Details | |
| Data Expected from TC | Number of rows in the channel ( + ACK ) |
| Data Size | 4 bytes |
| Data Info | Number of rows in channel with that channel ID |
| Allowed Value(s) | 0 - 65535 (dec) |
C.6.
Open the data file downlinked in the previous step.
Bytes 4-7 (0 indexed) in this file are the row count.
The new absRowsLogged parameter value to be set in a later step equals this row count + 1.
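The rule from Steps C.5 and C.6 for choosing the new absRowsLogged value can be sketched in Python as follows. The byte order of the row count in the downlinked file is not stated in this procedure, so big-endian is used here purely as an assumption and is exposed as a parameter; the function name is hypothetical.

```python
def new_abs_rows_logged(is_full, queried_row_count=None, last_row_file=None,
                        byteorder="big"):
    """Return the absRowsLogged value to set in Step C.9.

    is_full: value of core.storage.isFull for the channel (Step C.4).
    queried_row_count: number of rows returned by the Step C.5 query.
    last_row_file: raw bytes of the file downlinked for Step C.6.
    byteorder: ASSUMPTION, the procedure does not state the byte order.
    """
    if not is_full:
        if queried_row_count is None:
            raise ValueError("row count from Step C.5 is required")
        return queried_row_count
    if last_row_file is None or len(last_row_file) < 8:
        raise ValueError("downlinked file must contain at least 8 bytes")
    # Bytes 4-7 (0 indexed) hold the row count; the new value is count + 1.
    row_count = int.from_bytes(last_row_file[4:8], byteorder)
    return row_count + 1
```

For example, with a channel that is not full and 120 rows reported by the query, the new value is 120. For a full channel, the value comes from the file contents instead of the query.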
C.7.
Set the cdh.logging.XXXLogger.ARLIsSettable parameter to 1.

| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | Yes |
| Data Size | 1 byte |
| Data Info | |
| Allowed Value(s) | 0 (not settable) - 1 (settable) |
| Expected Value(s) | 1 |
| TM Details | |
| Data Expected from TC | No ( + ACK ) |
C.8.
Confirm the Set in the previous step with a Get (i.e. confirm the value was set successfully).
C.9.
Set the cdh.logging.XXXLogger.absRowsLogged parameter to equal the new absRowsLogged parameter value determined in Step C.5 or C.6.

| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | Yes |
| Data Size | 4 bytes |
| Data Info | |
| Allowed Value(s) | 00000000 - FFFFFFFF (hex) |
| Expected Value(s) | Values determined in Step C.5/C.6 |
| TM Details | |
| Data Expected from TC | No ( + ACK ) |
C.10.
Confirm the Set in the previous step with a Get (i.e. confirm the value was set successfully).
C.11.
Set the cdh.logging.XXXLogger.enabled parameter to 1 to re-enable logging of this channel type.

| TC Details | |
|---|---|
| MCS Operation | |
| Action/Param Name | |
| Data Expected with TC | Yes |
| Data Size | 1 byte |
| Data Info | |
| Allowed Value(s) | 0 (disabled) - 1 (enabled) |
| Expected Value(s) | 1 |
| TM Details | |
| Data Expected from TC | No ( + ACK ) |
C.12.
Confirm the Set in the previous step with a Get (i.e. confirm the value was set successfully).
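The ordering of the Set/Get pairs in this section can be sketched as below. This is not an MCS script: `FakeMcs` is a hypothetical in-memory stand-in with invented `set`/`get` methods, `HKLogger` in the usage note is an illustrative logger name, and `determine_new_value` stands for whatever the Operator does in Steps C.4 to C.6.

```python
class FakeMcs:
    """Hypothetical in-memory stand-in for the MCS (illustration only)."""

    def __init__(self):
        self.params = {}

    def set(self, name, value):
        self.params[name] = value

    def get(self, name):
        return self.params.get(name)


def set_and_confirm(mcs, name, value):
    """Set a parameter, then Get it back and check the value took effect."""
    mcs.set(name, value)
    readback = mcs.get(name)
    if readback != value:
        raise RuntimeError(f"{name}: set {value!r} but read back {readback!r}")


def reset_abs_rows_logged(mcs, logger, determine_new_value):
    """Walk one logger through the Set/Get steps of Section C, in order."""
    prefix = f"cdh.logging.{logger}"
    set_and_confirm(mcs, f"{prefix}.enabled", 0)          # Steps C.2, C.3
    new_value = determine_new_value()                     # Steps C.4 to C.6
    set_and_confirm(mcs, f"{prefix}.ARLIsSettable", 1)    # Steps C.7, C.8
    set_and_confirm(mcs, f"{prefix}.absRowsLogged", new_value)  # C.9, C.10
    set_and_confirm(mcs, f"{prefix}.enabled", 1)          # Steps C.11, C.12
    return new_value
```

Calling `reset_abs_rows_logged(FakeMcs(), "HKLogger", lambda: 257)` runs the sequence against the stand-in. The row count is determined only after logging has been disabled, so that it cannot change before the new value is set.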
D. Downlinking Data
Warning
If Section C has NOT already been followed for a channel/logger, take care when assessing which rows of data to downlink from the channel/logger as the absRowsLogged parameter, which is used for the downlink logic in EIR-OPS-011: Downlink Data From Storage, will be incorrect after the repeated first-boot scenario unless the parameter is updated via Section C.
D.1.
The Event, HK and TED data logs should be given the highest priority for data downlink to best establish the main chain of events that led to the repeated first-boot scenario.
However, this downlink priority should be re-assessed by the Operators between each pass following the analyses performed in Section E, as the analyses will likely reveal that a particular data type is more relevant than another for establishing the cause of this anomaly.
With the downlink priority for a given pass established, during the pass, downlink data not previously retrieved by the ground segment from the relevant on-board storage channels according to EIR-OPS-011: Downlink Data From Storage.
Between passes, carry out the analysis proposed in Section E.
E. Debugging the Issue
E.1.
If a first-boot scenario was confirmed in Section B, the most likely reason for this anomaly is that the no-GS-TC Watchdog was triggered twice in succession and, therefore, a configuration wipe and spacecraft reboot automatically occurred in an attempt to return the spacecraft to a configuration in which 2-way communications can be achieved.
To determine whether this scenario has occurred, either (or both) of the following assessments can be performed:
Search the downlinked Event data for ‘NoTCReceived’ events.
If this event is seen twice, from the same image, within a period of 2 x NoTCWatchdogTimeout days, then the no-GS-TC watchdog has caused the first-boot scenario and the reason why this watchdog was triggered should be investigated.
Warning
This search might not be possible if the Event log channel has been filled/overwritten with ADM I2C read/write error events since the first-boot scenario occurred.
Assess the downlinked TED data.
If the platform.CMC.rxPacketCount parameter did not increase for 2 x NoTCWatchdogTimeout days, then the no-GS-TC watchdog has caused the first-boot scenario and the reason why this watchdog was triggered should be investigated.
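Both assessments above can be sketched in Python once the downlinked data has been parsed on the ground. The sketch assumes the NoTCWatchdogTimeout value is expressed in days, that the Event timestamps passed in are already filtered to ‘NoTCReceived’ events from one image, and that the TED data is available as (timestamp, rxPacketCount) pairs; the function names are hypothetical.

```python
from datetime import datetime, timedelta


def watchdog_fired_twice(event_times, timeout_days):
    """True if two NoTCReceived events lie within 2 x timeout of each other."""
    window = timedelta(days=2 * timeout_days)
    times = sorted(event_times)
    return any(later - earlier <= window
               for earlier, later in zip(times, times[1:]))


def rx_count_stalled(samples, timeout_days):
    """True if rxPacketCount stayed unchanged for at least 2 x timeout.

    samples: iterable of (datetime, rxPacketCount) pairs from the TED data.
    Any change in the count (including a drop after a reboot) starts a new
    plateau, so only genuinely flat stretches are measured.
    """
    window = timedelta(days=2 * timeout_days)
    ordered = sorted(samples)
    if not ordered:
        return False
    plateau_start, plateau_count = ordered[0]
    for timestamp, count in ordered[1:]:
        if count != plateau_count:
            plateau_start, plateau_count = timestamp, count
        elif timestamp - plateau_start >= window:
            return True
    return False
```

Either function returning True is consistent with the no-GS-TC watchdog having caused the first-boot scenario; sparse TED sampling can hide a short stall, so a False result from the second check is weaker evidence than a True one.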
E.2.
It is HIGHLY unlikely that the Operator would not know about commands sent to the spacecraft that would lead to the Separation Sequence restarting. However, the possibility should still be ruled out. Therefore, using the MCS/GS logs, verify that the following TCs were not sent to the spacecraft since the state of the spacecraft was last as expected:
Invoke: mission.ModeManager.transitionToSeparationMode
Invoke: mission.SeparationSequence.SeparationSequenceRestart
Invoke: core.ConfigurationManager.eraseAll
Invoke: core.ConfigurationManager.erase
Invoke: core.ConfigurationManager.resetAll
If any of these commands were sent to the spacecraft since its state was last as expected, this Separation Sequence restart was likely a result of the command(s).
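A minimal Python sketch of this log check is given below. The MCS/GS log format is not described in this procedure, so the sketch assumes the log has already been reduced to a list of invoked action names covering the period since the spacecraft state was last as expected; the function name is hypothetical.

```python
# Actions listed in Step E.2 that can restart the Separation Sequence.
RESTART_TCS = (
    "mission.ModeManager.transitionToSeparationMode",
    "mission.SeparationSequence.SeparationSequenceRestart",
    "core.ConfigurationManager.eraseAll",
    "core.ConfigurationManager.erase",
    "core.ConfigurationManager.resetAll",
)


def find_restart_tcs(invoked_actions):
    """Return the invoked actions that match the Step E.2 list, in log order.

    Names are compared exactly, so "erase" and "eraseAll" are not confused.
    """
    return [action for action in invoked_actions if action in RESTART_TCS]
```

An empty result rules out this cause for the period covered by the log extract.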
E.3.
If neither of the above situations has occurred, the Software Engineer should investigate a possible bug/fault in the flight software, and consider, e.g.:
Was there a reboot around the time of the fault? If so, was the reboot a result of a full SC power-cycle or an OBC reset?
Is there anything in the Event log to suggest that the persisted configuration data failed to load following a reboot?
…
F. Resuming Nominal Operations
F.1.
Using the procedures in this manual (e.g. see EIR-OPS-024: Boot Into OBC Image and EIR-OPS-012: Set Up Nominal Operations), the Operator should resume nominal operations (i.e. Primary image + Nominal Mode with the experiment running and data logging on-going) once the anomaly investigation in Section E is complete.
END OF PROCEDURE