Why is there a need to modify system call tables in Linux?
I have been learning about rootkits recently and have noticed the hooking
techniques that kernel-land rootkits use in order to perform malicious actions.
A typical hooking operation hooks a legitimate system call and runs the malicious action first, before actually calling the legitimate one.
But if that is the case, why not make the system call table unmodifiable from the start?
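The hooking pattern described above can be sketched in user space with an ordinary table of function pointers. This is only an illustration of the idea, not kernel code: the table, the handler signature, and every name here (demo_table, hooked_read, install_hook, and so on) are hypothetical stand-ins for the real syscall table and handlers.

```c
/* Hypothetical stand-in for a syscall handler signature. */
typedef long (*handler_t)(long arg);

enum { NR_DEMO_READ = 0, NR_DEMO_MAX = 1 };

static long hook_calls;          /* counts how often the hook ran */
static handler_t saved_handler;  /* original entry, saved when hooking */

/* The "legitimate" handler: simply doubles its argument. */
static long demo_read(long arg)
{
    return arg * 2;
}

/* Dispatch table indexed by call number, playing the role of the syscall table. */
static handler_t demo_table[NR_DEMO_MAX] = { demo_read };

/* The hook: perform the extra action first, then call the original handler. */
static long hooked_read(long arg)
{
    hook_calls++;
    return saved_handler(arg);
}

/* Save the original entry and overwrite it with the hook. */
void install_hook(void)
{
    saved_handler = demo_table[NR_DEMO_READ];
    demo_table[NR_DEMO_READ] = hooked_read;
}

/* Put the original entry back. */
void remove_hook(void)
{
    demo_table[NR_DEMO_READ] = saved_handler;
}

/* Callers never name a handler directly; they go through the table. */
long demo_dispatch(int nr, long arg)
{
    return demo_table[nr](arg);
}
```

Calling demo_dispatch(NR_DEMO_READ, 21) returns 42 both before and after install_hook(); the only visible difference is that hook_calls increases, which is why this kind of hook is hard to notice from the caller's side.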
linux operating-systems
26
"make the system call table to be unmodifiable from the start" How do you propose doing that (in a way that wouldn't be trivial for a kernel rootkit to undo)?
– Joseph Sible
May 28 at 0:51
3
Oh okay, I see where you are coming from. That made a lot of sense! Thanks :)
– meoware
May 28 at 1:18
2
I guess on x86, taking advantage of all four instead of only two privilege rings would allow some possibilities here :)
– rackandboneman
May 28 at 22:56
1
@JosephSible There are a few ways to do that on a modern CPU (esp. with EPT and VT-x), but there are still a thousand other ways to hook syscalls that don't involve modifying the syscall table.
– forest
May 29 at 3:57
edited May 29 at 15:22
Peter Mortensen
asked May 28 at 0:44
meoware
2 Answers
The syscall table is read-only, and has been since kernel 2.6.16. However, a kernel rootkit has the ability to make it writable again. All it needs to do is execute a function like this* with the table's address as the argument:
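static void set_addr_rw(const unsigned long addr)
{
    unsigned int level;
    pte_t *pte;

    /* Find the page table entry that maps addr. */
    pte = lookup_address(addr, &level);
    if (!pte)
        return;
    /* Set the RW bit only if it is currently clear. */
    if (!(pte->pte & _PAGE_RW))
        pte->pte |= _PAGE_RW;
    /* Flush this CPU's TLB so the updated permission takes effect. */
    local_flush_tlb();
}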
This changes the permissions of the syscall table and makes it possible to edit it. If this doesn't work for whatever reason, then write protection in the kernel can be globally disabled with the following ASM:
cli
mov %cr0, %eax
and $~0x10000, %eax
mov %eax, %cr0
sti
This disables interrupts, clears the WP (Write-Protect) bit in CR0, and re-enables interrupts. The snippet is written for 32-bit x86; on x86-64 the same sequence uses %rax in place of %eax.
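To make the mask in the and instruction concrete, here is a small sketch in plain C that performs the same bit arithmetic on an ordinary integer. It does not touch the real CR0 register (that requires ring 0); the helper names and the sample value 0x80050033 are assumptions chosen for illustration.

```c
#include <stdint.h>

/* WP (Write Protect) is bit 16 of CR0, i.e. the 0x10000 in the assembly. */
#define CR0_WP_MASK (UINT64_C(1) << 16)

/* Value of CR0 with WP cleared; all other bits are left untouched. */
uint64_t cr0_clear_wp(uint64_t cr0)
{
    return cr0 & ~CR0_WP_MASK;
}

/* Value of CR0 with WP set again. */
uint64_t cr0_set_wp(uint64_t cr0)
{
    return cr0 | CR0_WP_MASK;
}

/* Non-zero if the WP bit is set in the given value. */
int cr0_wp_enabled(uint64_t cr0)
{
    return (cr0 & CR0_WP_MASK) != 0;
}
```

For the sample value 0x80050033, cr0_clear_wp() yields 0x80040033, and cr0_set_wp() on that result restores the original value.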
So why is it marked as read-only if it's so easy to disable? One reason is that some vulnerabilities allow modifying kernel memory without directly executing code. Marking critical areas of the kernel as read-only makes such bugs harder to exploit, because the attacker needs an additional vulnerability to mark the pages as writable (or to disable write protection altogether). This doesn't provide very strong security, so the main reason it is marked as read-only is to make it easier to catch accidental overwrites caused by bugs and keep them from causing a catastrophic and unrecoverable system crash.
* The kernel's internal API changes all the time, so this exact function may not work on older or newer kernels. Globally disabling CR0.WP in ASM, however, is guaranteed to work on all x86 systems regardless of the kernel version.
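A user-space analogy may help show why a read-only marking does little against code that is itself allowed to change permissions. The sketch below assumes a POSIX system with mmap and mprotect; the function name write_after_unprotect is hypothetical. Here mprotect plays the role that set_addr_rw plays in the kernel. This is not kernel code and not how a rootkit operates.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map one anonymous page read-only, flip it to read-write, store a value,
 * and read it back. Returns the value read back, or -1 on failure.
 */
int write_after_unprotect(int value)
{
    long page = sysconf(_SC_PAGESIZE);
    int *p;
    int seen;

    if (page <= 0)
        return -1;

    p = mmap(NULL, (size_t)page, PROT_READ,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* A store to p[0] at this point would fault: the page is read-only. */

    if (mprotect(p, (size_t)page, PROT_READ | PROT_WRITE) != 0) {
        munmap(p, (size_t)page);
        return -1;
    }

    p[0] = value;   /* now permitted */
    seen = p[0];

    munmap(p, (size_t)page);
    return seen;
}
```

The protection stops a stray write, which is the accidental-overwrite case the answer describes, but it does not stop code that can simply ask for the permission to be changed first.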
edited May 30 at 1:28
answered May 28 at 0:58
forest
As noted by forest, modern Linux does not allow this, but it's easy to override.
However, historically it was useful (and maybe still is) for security purposes: hot-patching against vulnerabilities. Back in the 1990s and early 2000s, whenever a new vulnerability was announced for a syscall I didn't absolutely need (ptrace was a really common one back then), I'd write a kernel module to overwrite the function address in the syscall table with the address of a function that just returned -ENOSYS. This eliminated the attack surface until an upgraded kernel was available. For some dubious syscalls that I didn't need and that repeatedly had vulnerabilities, I just preemptively did this and left the module enabled all the time.
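The hot-patching trick can be sketched in user space as follows. It is a minimal illustration under stated assumptions, not the module described above: the table, the two sample handlers, and the names stub_disable, stub_restore, and stub_dispatch are all hypothetical, and a real kernel module would also have to deal with the read-only protection of the table.

```c
#include <errno.h>

/* Hypothetical stand-in for a syscall handler signature. */
typedef long (*stub_handler_t)(long arg);

enum { NR_STUB_SAFE = 0, NR_STUB_RISKY = 1, NR_STUB_MAX = 2 };

static long stub_safe(long arg)  { return arg + 1; }
static long stub_risky(long arg) { return arg * 2; }  /* imagine this one is vulnerable */

/* Replacement handler: behaves as if the call were not implemented. */
static long stub_not_implemented(long arg)
{
    (void)arg;
    return -ENOSYS;
}

static stub_handler_t stub_table[NR_STUB_MAX] = { stub_safe, stub_risky };
static stub_handler_t stub_saved[NR_STUB_MAX];   /* originals, for restoring later */

/* Point entry nr at the -ENOSYS stub. Returns 0 on success, -1 on a bad number. */
int stub_disable(int nr)
{
    if (nr < 0 || nr >= NR_STUB_MAX)
        return -1;
    if (stub_table[nr] != stub_not_implemented) {
        stub_saved[nr] = stub_table[nr];
        stub_table[nr] = stub_not_implemented;
    }
    return 0;
}

/* Restore entry nr if it was disabled. Returns 0 on success, -1 otherwise. */
int stub_restore(int nr)
{
    if (nr < 0 || nr >= NR_STUB_MAX || stub_saved[nr] == 0)
        return -1;
    stub_table[nr] = stub_saved[nr];
    stub_saved[nr] = 0;
    return 0;
}

/* All callers go through the table, so disabling an entry affects every caller. */
long stub_dispatch(int nr, long arg)
{
    return stub_table[nr](arg);
}
```

After stub_disable(NR_STUB_RISKY), every call through the table to that entry returns -ENOSYS while the other entry keeps working, and stub_restore(NR_STUB_RISKY) undoes the change once a fixed handler is available.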
edited May 29 at 15:22
Peter Mortensen
answered May 28 at 21:17
R..
4
Heh, I did the same thing. It's really handy as a hacky fix.
– forest
May 29 at 2:58