Why is there a need to modify system call tables in Linux?













34

















I have been learning about rootkits recently and have noticed the hooking techniques that kernel-land rootkits use to perform malicious actions.

A typical hooking operation hooks a legitimate system call so that the malicious action runs first, before the legitimate action is actually called.



But if that is the case, why not make the system call table to be unmodifiable from the start?


































  • 26





    "make the system call table to be unmodifiable from the start" How do you propose doing that (in a way that wouldn't be trivial for a kernel rootkit to undo)?

    – Joseph Sible
    May 28 at 0:51






  • 3





    Oh okay, I see where you are coming from. That made a lot of sense! Thanks :)

    – meoware
    May 28 at 1:18






  • 2





    I guess on x86, taking advantage of all four instead of only two privilege rings would allow some possibilities here :)

    – rackandboneman
    May 28 at 22:56






  • 1





    @JosephSible There are a few ways to do that on a modern CPU (esp. with EPT and VT-x), but there are still a thousand other ways to hook syscalls that don't involve modifying the syscall table.

    – forest
    May 29 at 3:57




















linux operating-systems














asked May 28 at 0:44 by meoware
edited May 29 at 15:22 by Peter Mortensen














2 Answers


















58


















The syscall table is read-only, and has been since kernel 2.6.16. However, a kernel rootkit has the ability to make it writable again. All it needs to do is execute a function like this* with the table as the argument:



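    static void set_addr_rw(const unsigned long addr)
    {
        unsigned int level;
        pte_t *pte;

        /* Find the page table entry that maps addr. */
        pte = lookup_address(addr, &level);
        if (!pte)
            return;

        /* Set the writable bit if it is not already set. */
        if (!(pte->pte & _PAGE_RW))
            pte->pte |= _PAGE_RW;

        local_flush_tlb();
    }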


This changes the permissions of the syscall table and makes it possible to edit it. If this doesn't work for whatever reason, then write protection in the kernel can be globally disabled with the following ASM:



    cli                     # disable interrupts
    mov %cr0, %eax          # read CR0
    and $~0x10000, %eax     # clear bit 16 (WP)
    mov %eax, %cr0          # write CR0 back
    sti                     # re-enable interrupts


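This disables interrupts, clears the WP (Write Protect) bit in CR0 (bit 16, mask 0x10000), and re-enables interrupts. With WP cleared, code running in ring 0 can write to pages that are mapped read-only.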
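The bit arithmetic behind `and $~0x10000, %eax` can be checked safely in user space. The following is a minimal sketch that operates on an ordinary integer, not on the real control register; the names `CR0_WP_BIT`, `CR0_WP_MASK`, `cr0_clear_wp`, `cr0_set_wp` and `cr0_wp_is_set` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; WP is bit 16 of CR0. */
#define CR0_WP_BIT  16u
#define CR0_WP_MASK (UINT32_C(1) << CR0_WP_BIT)

/* The mask must match the constant used in the assembly above. */
static_assert(CR0_WP_MASK == 0x10000u, "WP mask should be 0x10000");

/* Return a copy of a CR0-like value with the WP bit cleared. */
static uint32_t cr0_clear_wp(uint32_t cr0)
{
    return cr0 & ~CR0_WP_MASK;
}

/* Return a copy of a CR0-like value with the WP bit set. */
static uint32_t cr0_set_wp(uint32_t cr0)
{
    return cr0 | CR0_WP_MASK;
}

/* Report whether the WP bit is set (1) or clear (0). */
static int cr0_wp_is_set(uint32_t cr0)
{
    return (cr0 & CR0_WP_MASK) != 0;
}
```

For an example value of 0x80050033, `cr0_clear_wp` yields 0x80040033: only bit 16 changes, and every other flag is left alone.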



So why is it marked as read-only if it's so easy to disable? One reason is that vulnerabilities exist which allow modifying kernel memory but not necessarily executing code directly. When critical areas of the kernel are marked read-only, exploiting such a vulnerability becomes harder, because the attacker needs an additional vulnerability to mark the pages as writable (or to disable write protection altogether). This doesn't provide very strong security, though, so the main reason the table is marked read-only is to catch accidental overwrites caused by bugs before they lead to a catastrophic and unrecoverable system crash.
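The "catch accidental overwrites" effect has a user-space analogue that is easy to try. The sketch below is not the kernel mechanism; it assumes a Linux system and uses `mmap` and `mprotect` to make one page read-only, then lets a child process attempt a stray write. The helper names `make_readonly_page` and `signal_from_stray_write` are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map one anonymous page, fill it with 0x41, then mark it read-only.
 * Returns the page address, or NULL on failure.
 */
static unsigned char *make_readonly_page(size_t *len_out)
{
    assert(len_out != NULL);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;

    size_t len = (size_t)page;
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    memset(p, 0x41, len);
    if (mprotect(p, len, PROT_READ) != 0) {
        munmap(p, len);
        return NULL;
    }
    *len_out = len;
    return p;
}

/*
 * Attempt a stray write in a child process and report the signal that
 * killed it (0 if the child exited normally, -1 on fork/wait failure).
 */
static int signal_from_stray_write(volatile unsigned char *p)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        p[0] = 0x42;        /* faults if the page is read-only */
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux the child is expected to be killed by SIGSEGV at the write, so the bug is reported at the faulting instruction instead of silently corrupting data that is only noticed later.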



* The kernel's internal API changes all the time, so this exact function may not work on older or newer kernels. Globally disabling CR0.WP in ASM, however, is guaranteed to work on all x86 systems regardless of the kernel version.










































    22


















    As noted by forest, modern Linux does not allow this, but it's easy to override.



    However, historically it was useful (and maybe still is) for security purposes: hot-patching against vulnerabilities. Back in the 1990s and early 2000s, whenever a new vulnerability was announced for a syscall I didn't absolutely need (ptrace was a really common one back then), I'd write a kernel module to overwrite the function address in the syscall table with the address of a function that just performed return -ENOSYS;. This eliminated the attack surface until an upgraded kernel was available. For some dubious syscalls I didn't need that repeatedly had vulnerabilities, I just preemptively did this for them and left the module enabled all the time.
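    The idea of pointing a table entry at a stub can be illustrated without touching a kernel. The following is a user-space sketch only: the miniature table, the two toy handlers and the names `dispatch`, `disable_call` and `restore_call` are all hypothetical, and a real module would additionally have to deal with locating the table and with its write protection.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical miniature "syscall table": an array of function pointers. */
typedef long (*handler_fn)(long arg);

enum { NR_DOUBLE, NR_NEGATE, NR_COUNT };

static long do_double(long arg) { return arg * 2; }
static long do_negate(long arg) { return -arg; }

/* Stub that stands in for a disabled call. */
static long disabled_call(long arg)
{
    (void)arg;
    return -ENOSYS;
}

static handler_fn table[NR_COUNT] = {
    [NR_DOUBLE] = do_double,
    [NR_NEGATE] = do_negate,
};

/* Dispatch by number, the way an entry path would index its table. */
static long dispatch(int nr, long arg)
{
    if (nr < 0 || nr >= NR_COUNT)
        return -ENOSYS;
    return table[nr](arg);
}

/* Swap in the stub and return the previous handler so it can be restored. */
static handler_fn disable_call(int nr)
{
    assert(nr >= 0 && nr < NR_COUNT);
    handler_fn old = table[nr];
    table[nr] = disabled_call;
    return old;
}

/* Put a previously saved handler back. */
static void restore_call(int nr, handler_fn old)
{
    assert(nr >= 0 && nr < NR_COUNT && old != NULL);
    table[nr] = old;
}
```

    After `disable_call(NR_NEGATE)`, callers of that entry get `-ENOSYS` while the other entry keeps working, and `restore_call` undoes the change once the saved handler is no longer considered risky.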






























    • 4





      Heh, I did the same thing. It's really handy as a hacky fix.

      – forest
      May 29 at 2:58















Answer 1 (58 votes): answered May 28 at 0:58 by forest, edited May 30 at 1:28




























Answer 2 (22 votes): answered May 28 at 21:17 by R.., edited May 29 at 15:22 by Peter Mortensen










