How to combine multiple text files of different lengths and multiple columns by a column

I have 60 text files of different lengths and same column names.

For example:

cat Sample_145_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  19258 circRNA

    612 ciRNA



cat Sample_146_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  17791 circRNA

    729 ciRNA



cat Sample_147_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  22838 circRNA

    686 ciRNA



cat Sample_148_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  19404 circRNA

    475 ciRNA

I want to produce a 'master' table of all identified circRNAs, with readnumber as column for each sample and flankintronas rownames:

screen shot of the file

edited 13 hours ago

dessert

25.1k673106

asked 15 hours ago

grt

New contributor

add a comment |

I have 60 text files of different lengths and same column names.

For example:

cat Sample_145_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  19258 circRNA

    612 ciRNA



cat Sample_146_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  17791 circRNA

    729 ciRNA



cat Sample_147_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  22838 circRNA

    686 ciRNA



cat Sample_148_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  19404 circRNA

    475 ciRNA

I want to produce a 'master' table of all identified circRNAs, with readnumber as column for each sample and flankintronas rownames:

screen shot of the file

edited 13 hours ago

dessert

25.1k673106

asked 15 hours ago

grt

New contributor

add a comment |

I have 60 text files of different lengths and same column names.

For example:

cat Sample_145_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  19258 circRNA

    612 ciRNA



cat Sample_146_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  17791 circRNA

    729 ciRNA



cat Sample_147_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  22838 circRNA

    686 ciRNA



cat Sample_148_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  19404 circRNA

    475 ciRNA

I want to produce a 'master' table of all identified circRNAs, with readnumber as column for each sample and flankintronas rownames:

screen shot of the file

edited 13 hours ago

dessert

25.1k673106

asked 15 hours ago

grt

New contributor

I have 60 text files of different lengths and same column names.

For example:

cat Sample_145_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  19258 circRNA

    612 ciRNA



cat Sample_146_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  17791 circRNA

    729 ciRNA



cat Sample_147_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  22838 circRNA

    686 ciRNA



cat Sample_148_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c

  19404 circRNA

    475 ciRNA

I want to produce a 'master' table of all identified circRNAs, with readnumber as column for each sample and flankintronas rownames:

screen shot of the file

command-line

edited 13 hours ago

dessert

25.1k673106

asked 15 hours ago

grt

New contributor

edited 13 hours ago

dessert

25.1k673106

asked 15 hours ago

grt

New contributor

edited 13 hours ago

dessert

25.1k673106

edited 13 hours ago

dessert

25.1k673106

edited 13 hours ago

dessert

25.1k673106

asked 15 hours ago

grt

New contributor

asked 15 hours ago

grt

asked 15 hours ago

grt

New contributor

grt is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.

add a comment |

1 Answer
1

active

oldest

votes

If all of the columns in all of the files are in the same order, then just concat them together with >>:

for x in {1..60}; do 

    # These flags for tail just cut of the top line, which is your headers

    tail -n 2 Sample_$x_blah.txt >> Sample_master.txt

    # and the double carat makes the output append^ 

done

If not, then you can write the translations in awk sort of like you had above, i.e.

$ cat Sample_1.txt 

col1,col2,col3,col4 #etc 

$ cat Sample_2.txt 

col4,col3,col2,col1 

$ cat Sample_1.txt > Sample_Master.txt # no translation needed

$ awk '{print $4","$3","$2","$1 }' Sample_2.txt >> Sample_Master.txt

But with 60 files, that would be more work than- something like writing a python script using python's csv lib...

edited 13 hours ago

dessert

25.1k673106

answered 14 hours ago

rm-vanda

2,29821323

add a comment |

Your Answer

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "89"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

grt is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2faskubuntu.com%2fquestions%2f1128946%2fhow-to-combine-multiple-text-files-of-different-lengths-and-multiple-columns-by%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

If all of the columns in all of the files are in the same order, then just concat them together with >>:

for x in {1..60}; do 

    # These flags for tail just cut of the top line, which is your headers

    tail -n 2 Sample_$x_blah.txt >> Sample_master.txt

    # and the double carat makes the output append^ 

done

If not, then you can write the translations in awk sort of like you had above, i.e.

$ cat Sample_1.txt 

col1,col2,col3,col4 #etc 

$ cat Sample_2.txt 

col4,col3,col2,col1 

$ cat Sample_1.txt > Sample_Master.txt # no translation needed

$ awk '{print $4","$3","$2","$1 }' Sample_2.txt >> Sample_Master.txt

But with 60 files, that would be more work than- something like writing a python script using python's csv lib...

edited 13 hours ago

dessert

25.1k673106

answered 14 hours ago

rm-vanda

2,29821323

add a comment |

If all of the columns in all of the files are in the same order, then just concat them together with >>:

for x in {1..60}; do 

    # These flags for tail just cut of the top line, which is your headers

    tail -n 2 Sample_$x_blah.txt >> Sample_master.txt

    # and the double carat makes the output append^ 

done

If not, then you can write the translations in awk sort of like you had above, i.e.

$ cat Sample_1.txt 

col1,col2,col3,col4 #etc 

$ cat Sample_2.txt 

col4,col3,col2,col1 

$ cat Sample_1.txt > Sample_Master.txt # no translation needed

$ awk '{print $4","$3","$2","$1 }' Sample_2.txt >> Sample_Master.txt

But with 60 files, that would be more work than- something like writing a python script using python's csv lib...

edited 13 hours ago

dessert

25.1k673106

answered 14 hours ago

rm-vanda

2,29821323

add a comment |

If all of the columns in all of the files are in the same order, then just concat them together with >>:

for x in {1..60}; do 

    # These flags for tail just cut of the top line, which is your headers

    tail -n 2 Sample_$x_blah.txt >> Sample_master.txt

    # and the double carat makes the output append^ 

done

If not, then you can write the translations in awk sort of like you had above, i.e.

$ cat Sample_1.txt 

col1,col2,col3,col4 #etc 

$ cat Sample_2.txt 

col4,col3,col2,col1 

$ cat Sample_1.txt > Sample_Master.txt # no translation needed

$ awk '{print $4","$3","$2","$1 }' Sample_2.txt >> Sample_Master.txt

But with 60 files, that would be more work than- something like writing a python script using python's csv lib...

edited 13 hours ago

dessert

25.1k673106

answered 14 hours ago

rm-vanda

2,29821323

If all of the columns in all of the files are in the same order, then just concat them together with >>:

for x in {1..60}; do 

    # These flags for tail just cut of the top line, which is your headers

    tail -n 2 Sample_$x_blah.txt >> Sample_master.txt

    # and the double carat makes the output append^ 

done

If not, then you can write the translations in awk sort of like you had above, i.e.

$ cat Sample_1.txt 

col1,col2,col3,col4 #etc 

$ cat Sample_2.txt 

col4,col3,col2,col1 

$ cat Sample_1.txt > Sample_Master.txt # no translation needed

$ awk '{print $4","$3","$2","$1 }' Sample_2.txt >> Sample_Master.txt

But with 60 files, that would be more work than- something like writing a python script using python's csv lib...

edited 13 hours ago

dessert

25.1k673106

answered 14 hours ago

rm-vanda

2,29821323

edited 13 hours ago

dessert

25.1k673106

edited 13 hours ago

dessert

25.1k673106

edited 13 hours ago

dessert

25.1k673106

answered 14 hours ago

rm-vanda

2,29821323

answered 14 hours ago

rm-vanda

2,29821323

answered 14 hours ago

rm-vanda

2,29821323

add a comment |

grt is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

grt is a new contributor. Be nice, and check out our Code of Conduct.

Thanks for contributing an answer to Ask Ubuntu!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Hcfyk