From Fedora Project Wiki
No edit summary
(adding tracker bug)
 
(22 intermediate revisions by 2 users not shown)
Line 2: Line 2:


= Enable Drm Panic <!-- The name of your change proposal --> =
= Enable Drm Panic <!-- The name of your change proposal --> =
{{Change_Proposal_Banner}}


== Summary ==
== Summary ==
<!-- A sentence or two summarizing what this change is and what it will do. This information is used for the overall changeset summary page for each release. Note that motivation for the change should be in the Benefit to Fedora section below, and this part should answer the question "What?" rather than "Why?". -->
<!-- A sentence or two summarizing what this change is and what it will do. This information is used for the overall changeset summary page for each release. Note that motivation for the change should be in the Benefit to Fedora section below, and this part should answer the question "What?" rather than "Why?". -->


Drm_panic is a new feature in the Linux kernel that allows to display a panic screen when a kernel panic occurs.
Drm_panic is a new feature in the Linux kernel that displays a panic screen when a kernel panic occurs. This proposal is to enable DRM_PANIC in the Fedora kernel, to improve the kernel panic user experience.
This proposal is to enable DRM_PANIC in the Fedora kernel, to improve the kernel panic user experience.




Line 17: Line 14:
This should link to your home wiki page so we know who you are.  
This should link to your home wiki page so we know who you are.  
-->
-->
* Name: [[User:jfalempe| Jocelyn Falempe]]
* Name: [[User:jfalempe| Jocelyn Falempe]], [[User:Javierm| Javier Martinez Canillas]]
<!-- Include you email address that you can be reached should people want to contact you about helping with your change, status is requested, or technical issues need to be resolved. If the change proposal is owned by a SIG, please also add a primary contact person. -->
<!-- Include you email address that you can be reached should people want to contact you about helping with your change, status is requested, or technical issues need to be resolved. If the change proposal is owned by a SIG, please also add a primary contact person. -->
* Email: <jfalempe@redhat.com>
* Email: <jfalempe@redhat.com>, <javierm@redhat.com>
<!--- UNCOMMENT only for Changes with assigned Shepherd (by FESCo)
<!--- UNCOMMENT only for Changes with assigned Shepherd (by FESCo)
* FESCo shepherd: [[User:FASAccountName| Shehperd name]] <email address>
* FESCo shepherd: [[User:FASAccountName| Shehperd name]] <email address>
-->
-->


== Current status ==
== Current status ==
[[Category:ChangePageIncomplete]]
[[Category:ChangeAcceptedF42]]
<!-- When your change proposal page is completed and ready for review and announcement -->
<!-- When your change proposal page is completed and ready for review and announcement -->
<!-- remove Category:ChangePageIncomplete and change it to Category:ChangeReadyForWrangler -->
<!-- remove Category:ChangePageIncomplete and change it to Category:ChangeReadyForWrangler -->
Line 34: Line 30:
<!-- Select proper category, default is Self Contained Change -->
<!-- Select proper category, default is Self Contained Change -->
[[Category:SelfContainedChange]]
[[Category:SelfContainedChange]]
<!-- [[Category:SystemWideChange]] -->


* Targeted release: [https://docs.fedoraproject.org/en-US/releases/f<VERSION>/ Fedora Linux <VERSION>]
* Targeted release: [https://docs.fedoraproject.org/en-US/releases/f42/ Fedora Linux 42]
* Last updated: <!-- this is an automatic macro — you don't need to change this line -->  {{REVISIONYEAR}}-{{REVISIONMONTH}}-{{REVISIONDAY2}}  
* Last updated: <!-- this is an automatic macro — you don't need to change this line -->  {{REVISIONYEAR}}-{{REVISIONMONTH}}-{{REVISIONDAY2}}  
<!-- After the change proposal is accepted by FESCo, tracking bug is created in Bugzilla and linked to this page  
<!-- After the change proposal is accepted by FESCo, tracking bug is created in Bugzilla and linked to this page  
Line 44: Line 39:
ON_QA -> change is fully code complete
ON_QA -> change is fully code complete
-->
-->
* [Announced]
* [https://lists.fedoraproject.org/archives/list/devel-announce@lists.fedoraproject.org/thread/OMAPEK63F3MTT3F4YELEIEOJHHC5BPXF/ Announced]
* [<will be assigned by the Wrangler> Discussion thread]
* [https://discussion.fedoraproject.org/t/f42-change-proposal-enable-drm-panic-system-wide/125542 Discussion thread]
* FESCo issue: <will be assigned by the Wrangler>
* FESCo issue: [https://pagure.io/fesco/issue/3254 #3254]
* Tracker bug: <will be assigned by the Wrangler>
* Tracker bug: [https://bugzilla.redhat.com/show_bug.cgi?id=2309205 #2309205]
* Release notes tracker: <will be assigned by the Wrangler>
* Release notes tracker: <will be assigned by the Wrangler>


Line 57: Line 52:
With this feature, they will see a message saying the computer has crashed, and they need to reboot the computer.
With this feature, they will see a message saying the computer has crashed, and they need to reboot the computer.
Drm_panic has been introduced in kernel v6.10, but is still under active development.
Drm_panic has been introduced in kernel v6.10, but is still under active development.
In order to enable DRM_PANIC, you need to disable VT_CONSOLE in the kernel, this is to prevent a race condition, that if you are in a VT console when the panic occurs, both fbcon and drm_panic will write to the framebuffer at the same time, leading to corrupted output.
https://patchwork.freedesktop.org/series/134831/
The drawback is that tty0 won't show the kernel kmsg, and it can be harder to debug boot issue. But plymouth already takes care of this, and can display the boot kmsg when no VT console is present. https://gitlab.freedesktop.org/plymouth/plymouth/-/merge_requests/224
And the user experience would be better, because plymouth has better font and color support than fbcon.


Supported drivers are simpledrm, mgag200, ast, (and imx, tidss, on aarch64). I'm working on nouveau support, and I hope i915 and amdgpu will add support too.
Supported drivers are simpledrm, mgag200, ast, (and imx, tidss, on aarch64). I'm working on nouveau support, and I hope i915 and amdgpu will add support too.
If the driver is not supported, you won't see the panic screen, but it won't be worse than what you have today.
If the driver is not supported, you won't see the panic screen, but it won't be worse than what you have today.


Drm panic provides different panic screen. the default is "user" which will display a simple friendly message telling the user to reboot the computer. But for kernel developer, you can also set it to "kmsg", to see the last kmsg lines (so this is equivalent to the current fbcon).
Drm panic provides different panic screens. The default is "user" which will display a simple friendly message telling the user to reboot the computer. But for kernel developers, you can also set it to "kmsg", to see the last kmsg lines (so this is equivalent to the current fbcon). You can select the panic screen in Kconfig, or as a module parameter (drm.panic_screen=user) or at runtime with "echo -n kmsg > /sys/module/drm/parameters/panic_screen"
You can select the panic screen in Kconfig, or as a module parameter (drm.panic_screen=user) or at runtime with "echo -n kmsg > /sys/module/drm/parameters/panic_screen"


I've also made a proof of concept to add a panic screen with a QR code with debugging information, which will make it easier for users to report kernel panic in Fedora. An example can be seen here:
I've also made a proof of concept to add a panic screen with a QR code with debugging information, which will make it easier for users to report kernel panic in Fedora. An example can be seen here:
Line 104: Line 93:
-->
-->


This change will improve the user experience when a kernel panic occurs.
This change will improve the user experience when a kernel panic occurs. And it will help to report and debug kernel panics.


It's also a first step to switch to userspace console, and being able to disable CONFIG_VT in the kernel.
It's also a first step to switch to userspace console, and being able to disable CONFIG_VT in the kernel.
VT and fbcon are legacy part of the kernel, that would reduce maintenance burden if we can disable them, and
VT and fbcon are legacy part of the kernel, that would reduce maintenance burden if we can disable them.
It will also reduce CVE impact, as userspace vulnerability are usually less critical.
It will also reduce CVE impact, as userspace vulnerabilities are usually less critical.


== Scope ==
== Scope ==
* Proposal owners:
* Proposal owners:
<!-- What work do the feature owners have to accomplish to complete the feature in time for release?  Is it a large change affecting many parts of the distribution or is it a very isolated change? What are those changes?-->
<!-- What work do the feature owners have to accomplish to complete the feature in time for release?  Is it a large change affecting many parts of the distribution or is it a very isolated change? What are those changes?-->
No changes are required to the boot process.


* Other developers: <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Other developers: <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
<!-- What work do other developers have to accomplish to complete the feature in time for release?  Is it a large change affecting many parts of the distribution or is it a very isolated change? What are those changes?-->
<!-- What work do other developers have to accomplish to complete the feature in time for release?  Is it a large change affecting many parts of the distribution or is it a very isolated change? What are those changes?-->


* Release engineering: [https://pagure.io/releng/issues #Releng issue number] <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Release engineering: [https://pagure.io/releng/issues #Releng issue number] <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
<!-- Does this feature require coordination with release engineering (e.g. changes to installer image generation or update package delivery)?  Is a mass rebuild required?  include a link to the releng issue.  
<!-- Does this feature require coordination with release engineering (e.g. changes to installer image generation or update package delivery)?  Is a mass rebuild required?  include a link to the releng issue.  
The issue is required to be filed prior to feature submission, to ensure that someone is on board to do any process development work and testing and that all changes make it into the pipeline; a bullet point in a change is not sufficient communication -->
The issue is required to be filed prior to feature submission, to ensure that someone is on board to do any process development work and testing and that all changes make it into the pipeline; a bullet point in a change is not sufficient communication -->
There should be no impact on the installer.


* Policies and guidelines: N/A (not needed for this Change) <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Policies and guidelines: N/A (not needed for this Change) <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
Line 129: Line 125:
* Alignment with the Fedora Strategy:  
* Alignment with the Fedora Strategy:  
<!-- Does your proposal align with the current Fedora Strategy: https://discussion.fedoraproject.org/t/fedora-strategy-2028-february-march-planning-work-and-roadmap-til-flock/43618 ? It's okay if it doesn't, but it's something to consider -->
<!-- Does your proposal align with the current Fedora Strategy: https://discussion.fedoraproject.org/t/fedora-strategy-2028-february-march-planning-work-and-roadmap-til-flock/43618 ? It's okay if it doesn't, but it's something to consider -->
I think it perfectly fit the "Fedora is for everyone" goal, as the current kernel panic (either UI freeze or kmsg output in VT) is not user-friendly.


== Upgrade/compatibility impact ==
== Upgrade/compatibility impact ==
<!-- What happens to systems that have had a previous versions of Fedora installed and are updated to the version containing this change? Will anything require manual configuration or data migration? Will any existing functionality be no longer supported? -->
<!-- What happens to systems that have had a previous versions of Fedora installed and are updated to the version containing this change? Will anything require manual configuration or data migration? Will any existing functionality be no longer supported? -->


Enabling DRM_PANIC should be transparent to user, but disabling VT_CONSOLE may have a visible impact.
Enabling DRM_PANIC should be transparent to user.
Fortunately since Fedora 40, plymouth is able to display the kmsg messages.
If your graphic driver doesn't support drm_panic, you will still see the old kernel panic message if you're in a VT, or it will freeze in graphic mode.
For non-graphical boot, you can use systemd.log_target=console systemd.log_level=info and remove rhgb and quiet to see the kernel boot message.
If your graphic driver supports drm_panic, you will see the new panic screen.  


<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
Line 180: Line 178:
  - Green has been scientifically proven to be the most relaxing color. The move to a default background color of green with green text will result in Fedora users being the most relaxed users of any operating system.
  - Green has been scientifically proven to be the most relaxing color. The move to a default background color of green with green text will result in Fedora users being the most relaxed users of any operating system.
-->
-->
With DRM panic, users will be notified that their computer crashed, instead of it being unresponsive.
With v6.10, it's only for a few GPU drivers (simpledrm, mgag200, ast), but with simpledrm, it will already catch some common kernel panic cases, like root filesystem not found, or ramdisk corruption. (simpledrm is used at boot, and is later replaced with i915/amdgpu/nouveau ...)
It also prepares for future drm panic improvements, like having a kmsg panic screen, (should  be available in v6.11) or also have better debugging information, using QR code. A test sample is shown at https://github.com/kdj0c/panic_report/issues/1
The panic screen can be customized, (background/foreground colors). And a Fedora logo can be added too, but probably with an out-of-tree patch. Maybe UI or Marketing experts can help make it better fit the Fedora design.


== Dependencies ==
== Dependencies ==
Line 186: Line 192:
<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->


The main dependency, is to have a kernel v6.11 or later.
You will also need this patch to make drm_panic works with fbcon/vt_console: https://patchwork.freedesktop.org/series/136182/ that will land in v6.12


== Contingency Plan ==
== Contingency Plan ==


<!-- If you cannot complete your feature by the final development freeze, what is the backup plan?  This might be as simple as "Revert the shipped configuration".  Or it might not (e.g. rebuilding a number of dependent packages).  If you feature is not completed in time we want to assure others that other parts of Fedora will not be in jeopardy.  -->
<!-- If you cannot complete your feature by the final development freeze, what is the backup plan?  This might be as simple as "Revert the shipped configuration".  Or it might not (e.g. rebuilding a number of dependent packages).  If you feature is not completed in time we want to assure others that other parts of Fedora will not be in jeopardy.  -->
* Contingency mechanism: (What to do?  Who will do it?) N/A (not a System Wide Change)  <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Contingency mechanism: Revert the kernel configuration changes.
<!-- When is the last time the contingency mechanism can be put in place?  This will typically be the beta freeze. -->
<!-- When is the last time the contingency mechanism can be put in place?  This will typically be the beta freeze. -->
* Contingency deadline: N/A (not a System Wide Change)  <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Contingency deadline: N/A (not a System Wide Change)  <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
<!-- Does finishing this feature block the release, or can we ship with the feature in incomplete state? -->
<!-- Does finishing this feature block the release, or can we ship with the feature in incomplete state? -->
* Blocks release? N/A (not a System Wide Change), Yes/No <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Blocks release? N/A (not a System Wide Change), Yes/No <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->


== Documentation ==
== Documentation ==
<!-- Is there upstream documentation on this change, or notes you have written yourself?  Link to that material here so other interested developers can get involved. -->
<!-- Is there upstream documentation on this change, or notes you have written yourself?  Link to that material here so other interested developers can get involved. -->


<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
Kernel Kconfig for DRM_PANIC:
N/A (not a System Wide Change)
https://elixir.bootlin.com/linux/v6.10-rc7/source/drivers/gpu/drm/Kconfig#L107


== Release Notes ==
== Release Notes ==

Latest revision as of 15:05, 2 September 2024


Enable Drm Panic

Summary

Drm_panic is a new feature in the Linux kernel that displays a panic screen when a kernel panic occurs. This proposal is to enable DRM_PANIC in the Fedora kernel, to improve the kernel panic user experience.


Owner

Current status

Detailed Description

When the Linux kernel panics in Fedora 40, in most cases the screen just freezes. If you're in a VT console, you can see the kernel debug information, but it is hard to understand for users who are not kernel developers. With this feature, they will see a message saying that the computer has crashed and needs to be rebooted. Drm_panic was introduced in kernel v6.10, but is still under active development.

Supported drivers are simpledrm, mgag200 and ast (plus imx and tidss on aarch64). I'm working on nouveau support, and I hope i915 and amdgpu will add support too. If the driver is not supported, you won't see the panic screen, but it won't be worse than what you have today.

Drm panic provides different panic screens. The default is "user", which displays a simple, friendly message telling the user to reboot the computer. Kernel developers can also set it to "kmsg" to see the last kmsg lines (equivalent to the current fbcon output). You can select the panic screen in Kconfig, as a module parameter (drm.panic_screen=user), or at runtime with "echo -n kmsg > /sys/module/drm/parameters/panic_screen".
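
As a minimal sketch of the runtime selection described above, the two shell helpers below read and set the panic screen through the sysfs parameter. The helper names and the PANIC_SCREEN_PARAM variable are made up for this illustration. The accepted values are limited to the two screens named in this proposal, and a given kernel may offer others.

```shell
# Path of the drm module parameter named in the text above.
# It can be overridden, which is handy for trying the helpers on a scratch file.
PANIC_SCREEN_PARAM="${PANIC_SCREEN_PARAM:-/sys/module/drm/parameters/panic_screen}"

# Print the currently selected panic screen.
show_panic_screen() {
    cat "$PANIC_SCREEN_PARAM"
}

# Select a panic screen; only the two screens named in this proposal are accepted.
set_panic_screen() {
    case "$1" in
        user|kmsg)
            # No trailing newline, matching the "echo -n" form used in the text.
            printf '%s' "$1" > "$PANIC_SCREEN_PARAM"
            ;;
        *)
            echo "unsupported panic screen: $1" >&2
            return 1
            ;;
    esac
}
```

Run as root, "set_panic_screen kmsg" followed by "show_panic_screen" should report the new value. The setting does not survive a reboot; for that, the drm.panic_screen= module parameter mentioned above would be the place to set it.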

I've also made a proof of concept that adds a panic screen with a QR code containing debugging information, which will make it easier for users to report kernel panics in Fedora. An example can be seen here: https://github.com/kdj0c/panic_report/issues/1

Feedback

Benefit to Fedora

This change will improve the user experience when a kernel panic occurs. It will also help users report and debug kernel panics.

It's also a first step towards switching to a userspace console and being able to disable CONFIG_VT in the kernel. VT and fbcon are legacy parts of the kernel, and disabling them would reduce the maintenance burden. It would also reduce CVE impact, as userspace vulnerabilities are usually less critical.

Scope

  • Proposal owners:

No changes are required to the boot process.

  • Other developers:

  • Release engineering:

There should be no impact on the installer.

  • Policies and guidelines: N/A (not needed for this Change)
  • Trademark approval: N/A (not needed for this Change)
  • Alignment with the Fedora Strategy:

I think it perfectly fits the "Fedora is for everyone" goal, as the current kernel panic behavior (either a UI freeze or kmsg output in a VT) is not user-friendly.

Upgrade/compatibility impact

Enabling DRM_PANIC should be transparent to users. If your graphics driver doesn't support drm_panic, you will still see the old kernel panic message if you're in a VT, or the screen will freeze in graphical mode. If your graphics driver supports drm_panic, you will see the new panic screen.


Early Testing (Optional)

Do you require 'QA Blueprint' support? Y/N

How To Test

Currently, the easiest way to test is to use the simpledrm driver, as it can run on all hardware. First blacklist your native driver (i915, amdgpu or nouveau), then boot and check that you're using simpledrm. Then you can trigger a kernel panic with: echo c > /proc/sysrq-trigger

As this will crash your machine, you can also do it in a VM (in which case the driver to disable is virtio-gpu or vmwgfx).
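
The check-then-trigger steps above can be sketched as follows. This is only an illustration. It assumes the kernel log contains the usual "[drm] Initialized <driver> ..." line for each DRM driver, and it treats the last driver initialized as the active one, which is a heuristic rather than a guarantee. The function names are hypothetical.

```shell
# Print the DRM driver names found in a kernel log read from stdin.
# Assumes lines of the form "[drm] Initialized <driver> <version> ...".
drm_drivers_in_log() {
    sed -n 's/.*\[drm] Initialized \([^ ]*\) .*/\1/p'
}

# Trigger the test panic only if simpledrm was the last DRM driver initialized.
# Must be run as root. This crashes the machine on purpose.
panic_if_simpledrm() {
    last=$(dmesg | drm_drivers_in_log | tail -n 1)
    if [ "$last" = "simpledrm" ]; then
        sync                          # flush pending writes before crashing
        echo c > /proc/sysrq-trigger  # same command as in the text above
    else
        echo "active DRM driver looks like '$last', not simpledrm" >&2
        return 1
    fi
}
```

One way to blacklist the native driver for a single boot is to add a kernel argument such as modprobe.blacklist=i915 in the GRUB menu. Refusing to panic when another driver appears to be active avoids crashing a machine that would only show a frozen screen.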

To check that you can still see the kernel messages at boot, remove the "quiet" kernel command line argument in the GRUB menu; the kernel boot messages should still appear on the plymouth screen.
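
To confirm after booting that the "quiet" argument really was removed, a small helper like the one below can inspect the running kernel's command line. The function name is hypothetical, and the optional second argument exists only so the helper can be tried against a file other than /proc/cmdline.

```shell
# Succeed when the given word is one of the space-separated kernel arguments.
# $1: argument to look for, $2: command line file (defaults to /proc/cmdline).
cmdline_has() {
    tr -s ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For example, "cmdline_has quiet && echo 'quiet is still set'" prints the message only when the argument is still present.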


User Experience

With DRM panic, users will be notified that their computer crashed, instead of it being unresponsive.

With v6.10, this only covers a few GPU drivers (simpledrm, mgag200, ast). But simpledrm alone already catches some common kernel panic cases, such as a root filesystem that cannot be found or a corrupted ramdisk, because simpledrm is used at boot and is only later replaced with i915, amdgpu, nouveau, etc.

It also prepares for future drm panic improvements, such as a kmsg panic screen (which should be available in v6.11) or better debugging information using a QR code. A test sample is shown at https://github.com/kdj0c/panic_report/issues/1

The panic screen can be customized (background and foreground colors). A Fedora logo can be added too, but probably only with an out-of-tree patch. UI or Marketing experts may be able to help make it better fit the Fedora design.

Dependencies

The main dependency is kernel v6.11 or later.

You will also need this patch to make drm_panic work with fbcon/vt_console: https://patchwork.freedesktop.org/series/136182/ (it will land in v6.12).
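
As a rough way to check the version dependency on a given machine, the sketch below compares a kernel release string against a minimum version. The function name is made up for this example, and it relies on "sort -V" (version sort) as provided by GNU coreutils. Since distribution kernels can carry backported patches, the result is only a first indication.

```shell
# Succeed when kernel release $1 is at least version $2 (for example 6.11).
kernel_at_least() {
    have=${1%%-*}   # drop the distribution suffix, e.g. "-200.fc40.x86_64"
    lowest=$(printf '%s\n%s\n' "$have" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}
```

Typical usage would be: kernel_at_least "$(uname -r)" 6.11 || echo "kernel too old for this change"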

Contingency Plan

  • Contingency mechanism: Revert the kernel configuration changes.
  • Contingency deadline: N/A (not a System Wide Change)
  • Blocks release? N/A (not a System Wide Change), Yes/No

Documentation

Kernel Kconfig for DRM_PANIC: https://elixir.bootlin.com/linux/v6.10-rc7/source/drivers/gpu/drm/Kconfig#L107
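
To see whether an installed kernel was built with this option, one can look at the kernel configuration file that Fedora installs under /boot. The helper below is a minimal sketch. Its name is hypothetical, and the optional second argument is there only so it can be pointed at another config file.

```shell
# Succeed when the given Kconfig symbol is set to y or m.
# $1: symbol name including the CONFIG_ prefix
# $2: config file (defaults to the config of the running kernel)
kconfig_enabled() {
    grep -qE "^$1=(y|m)\$" "${2:-/boot/config-$(uname -r)}"
}
```

For example: kconfig_enabled CONFIG_DRM_PANIC && echo "DRM_PANIC is enabled"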

Release Notes