retrieve food groups from food item list

$begingroup$


I have a dataframe of food items (shown below). I need to create a food_group list that gives the food group each item belongs to; for example, all types of yoghurt should be in one group called yoghurt.

I used the snippet below to take the first segment of each comma-separated name, but it does not give the result I want, such as putting all yoghurts in one group:

from collections import Counter

# Split each name on commas and keep the first segment as the food group
food_group_0 = [i.split(',') for i in data['name']]
food_group = [item[0] for item in food_group_0]

# Count how many times each food group occurs
c = Counter(food_group)
print(c)

The dataframe:



0 4-Grain Flakes
1 4-Grain Flakes, Gluten Free
2 4-Grain Flakes, Riihikosken Vehnämylly
3 Almond
4 Almond Drink, Sweetened, Alrpo
5 Almond Drink, Unsweetened, Alrpo
6 Amaranth Flakes
7 Anchovy
8 Apple, Average, With Skin
9 Apple, Domestic, Without Skin
10 Apple, Domestic, With Skin
11 Apple, Dried
12 Apple, Imported, Without Skin
13 Apple, Imported, With Skin
14 Apple Chips
15 Apple Crisp Delight, Apple, Oat Flakes
16 Apple Jam
17 Apple Juice, Unsweetened, Vitamin C
18 Apple Kissel, Apple Soup, Dried Apples
19 Apple Kissel, Apple Soup, Fresh Apples
20 Apple Pie, Basic Sweet Dough, Gluten-Free, Con...
21 Apple Pie, Basic Sweet Dough, Low-Fat Milk
22 Apple Pie, Basic Sweet Dough, Naturally Gluten...
23 Apple Pie, Basic Sweet Dough, Whole Milk
24 Apple Pie, Shortbread Crust
25 Apple Pie, Shortbread Crust, Gluten-Free, Cont...
26 Apple Pie, Shortbread Crust, Naturally Gluten-...
27 Apple Pie, Shortbread Crust With Sour Milk
28 Apple Pie, Soft, Low-Fat Milk
29 Apple Pie With Quark Filling, Shortbread Crust
...
4068 Yoghurt, Plain, A+, Fat 2.5%, 1 Ug Vitamin D, ...
4069 Yoghurt, Plain, A+, Fat 2.5%, Lactose-Free, 1 ...
4070 Yoghurt, Plain, A+, Fat 4%, 1 Ug Vitamin D, La...
4071 Yoghurt, Plain, A+, Fatfree, 1 Ug Vitamin D, L...
4072 Yoghurt, Plain, A+ Greek, 2 % Fat, Lactose-Fre...
4073 Yoghurt, Plain, Ab, 0.2% Fat, Probiotics
4074 Yoghurt, Plain, Ab, 2.5% Fat, Probiotics
4075 Yoghurt, Plain, Activia, 3.4% Fat
4076 Yoghurt, Plain, Arla Protein, 1% Fat, Lactose-...
4077 Yoghurt, Plain, Bulgarian, 9% Fat
4078 Yoghurt, Plain, Fat-Free
4079 Yoghurt, Plain, Fat-Free, Lactose-Free, 1 Ug V...
4080 Yoghurt, Plain, Fat-Free, Low-Lactose, 0.5 Ug ...
4081 Yoghurt, Plain, Greek, 7% Fat, Lactose-Free
4082 Yoghurt, Plain, Organic, 3% Fat
4083 Yoghurt, Plain, Pirkka Reducol, 2.5% Fat, Low-...
4084 Yoghurt, Turkish/Greek, 10% Fat
4085 Yoghurt, Turkish/Greek, 10% Fat, Lactose-Free
4086 Yoghurt Sauce
4087 Yoghurt With Jam, Fat-Free
4088 Yoghurt With Muesli, A+, Fat 3.5%, Low-Lactose
4089 Yoghurt With Quark, Flavoured, Arla, 1.4% Fat,...
4090 Yoghurt With Quark, Flavoured, Luonto+, 1.2% F...
4091 Yoghurt With Quark, Flavoured, Valio, 1.7% Fat...
4092 Zander, Pike-Perch
4093 Zucchini, Boiled Without Salt
4094 Zucchini, Summer Squash
4095 Zucchini Filled With Minced Meat
4096 Zucchini Filled With Soya And Rice
4097 Zucchini Filled With Vegetables


















$endgroup$











  • $begingroup$
    I cannot just extract the first word, because that causes complications: for example, I would get 4-Grain instead of 4-Grain Flakes for the first item in the food list.
    $endgroup$
    – KHAN irfan
    55 mins ago










  • $begingroup$
    Are you able to share the data? And why doesn't splitting on the first comma (,) give the result you expect? It looks like it would work, according to your example data. Perhaps, like in your other question, you could create a multi-index: Yoghurt would be the first level, then Plain and, e.g., Flavoured would be the second level.
    $endgroup$
    – n1k31t4
    52 mins ago










  • $begingroup$
    @n1k31t4 But 4-Grain would be the first level and Grain would be the second level. Yes, I can share the data.
    $endgroup$
    – KHAN irfan
    39 mins ago
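
To illustrate the point discussed in the comments above, here is a minimal sketch in plain Python. The `names` list is a small sample copied from the listing in the question, and the two helper functions are illustrative names of my own, not part of the original post. It compares splitting on the first comma with taking the first whitespace-separated word.

```python
from collections import Counter

# Small sample of names copied from the listing in the question
names = [
    "4-Grain Flakes",
    "4-Grain Flakes, Gluten Free",
    "Almond Drink, Sweetened, Alrpo",
    "Yoghurt, Plain, Fat-Free",
    "Yoghurt With Jam, Fat-Free",
]


def first_segment(name):
    """Return the text before the first comma, without surrounding spaces."""
    return name.split(",", 1)[0].strip()


def first_word(name):
    """Return the first whitespace-separated word, minus any trailing comma."""
    return name.split()[0].rstrip(",")


# Group labels produced by each strategy, with their counts
by_comma = Counter(first_segment(n) for n in names)
by_word = Counter(first_word(n) for n in names)

print(by_comma)
print(by_word)
```

Splitting on the comma keeps "4-Grain Flakes" intact but treats "Yoghurt" and "Yoghurt With Jam" as different groups. Splitting on whitespace merges the yoghurts but shortens "4-Grain Flakes" to "4-Grain". Neither rule alone matches the grouping asked for in the question.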















python
















asked 58 mins ago









KHAN irfan





















1 Answer












$begingroup$

You can actually do the string splitting and indexing on the column itself; there is no need to extract the column and use list comprehensions.



Below, I take whatever comes before the first comma and put it in a new column called food_group, then take the first field after that comma and put it in a new column called sub_cat (sub-category):



df["food_group"] = df.name.str.split(",").str[0]
df["sub_cat"] = df.name.str.split(",").str[1]


Here is example output for some Yogurt data:



    id    name                                               food_group     sub_cat
44  4082  Yoghurt, Plain, Organic, 3% Fat                    Yoghurt        Plain
45  4083  Yoghurt, Plain, Pirkka Reducol, 2.5% Fat, Low-...  Yoghurt        Plain
46  4084  Yoghurt, Turkish/Greek, 10% Fat                    Yoghurt        Turkish/Greek
47  4085  Yoghurt, Turkish/Greek, 10% Fat, Lactose-Free      Yoghurt        Turkish/Greek
48  4086  Yoghurt Sauce                                      Yoghurt Sauce  NaN


Notice that any missing fields are filled with NaN. This happens when a value in the name column contains only a single field (i.e. no commas).
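
As a rough sketch of how this behaves on a few rows, the example below builds a small dataframe from names in the question and applies the same comma split. Two things here are my own assumptions rather than part of the answer: the fields are stripped of surrounding whitespace (splitting on "," alone leaves a leading space on the second field), and the `sub_cat_filled` column with its "(none)" label is a hypothetical way to replace the NaN values.

```python
import pandas as pd

# Small sample of names copied from the listing in the question
df = pd.DataFrame({"name": [
    "4-Grain Flakes",
    "Yoghurt, Plain, Organic, 3% Fat",
    "Yoghurt, Turkish/Greek, 10% Fat",
    "Yoghurt Sauce",
]})

# Split once; expand=True returns one column per comma-separated field
parts = df["name"].str.split(",", expand=True)

# Field 0 is the food group, field 1 the sub-category (NaN when absent)
df["food_group"] = parts[0].str.strip()
df["sub_cat"] = parts[1].str.strip()

# Hypothetical extra column: label rows without a second field
df["sub_cat_filled"] = df["sub_cat"].fillna("(none)")

print(df)
```

Rows such as "4-Grain Flakes" and "Yoghurt Sauce" have no comma, so their sub_cat stays NaN unless it is filled explicitly.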



EDIT



Here is the top of my dataframe, after the operation above:



In [13]: df.head(10)
Out[13]:
   id  name                                    food_group       sub_cat
0  0   4-Grain Flakes                          4-Grain Flakes   NaN
1  1   4-Grain Flakes, Gluten Free             4-Grain Flakes   Gluten Free
2  2   4-Grain Flakes, Riihikosken Vehnämylly  4-Grain Flakes   Riihikosken Vehnämylly
3  3   Almond                                  Almond           NaN
4  4   Almond Drink, Sweetened, Alrpo          Almond Drink     Sweetened
5  5   Almond Drink, Unsweetened, Alrpo        Almond Drink     Unsweetened
6  6   Amaranth Flakes                         Amaranth Flakes  NaN
7  7   Anchovy                                 Anchovy          NaN
8  8   Apple, Average, With Skin               Apple            Average
9  9   Apple, Domestic, Without Skin           Apple            Domestic



You could go on to make a multi-index from these two new columns, but it might not be necessary; it depends on what you want to do with the data afterwards.
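
For what it is worth, a multi-index over the two derived columns could look roughly like the sketch below. The three-row dataframe is a hypothetical sample with food_group and sub_cat already filled in, and the variable names `indexed` and `yoghurts` are mine.

```python
import pandas as pd

# Hypothetical sample that already carries the two derived columns
df = pd.DataFrame({
    "name": [
        "Apple, Dried",
        "Yoghurt, Plain, Fat-Free",
        "Yoghurt, Turkish/Greek, 10% Fat",
    ],
    "food_group": ["Apple", "Yoghurt", "Yoghurt"],
    "sub_cat": ["Dried", "Plain", "Turkish/Greek"],
})

# Use food_group as the first index level and sub_cat as the second
indexed = df.set_index(["food_group", "sub_cat"]).sort_index()

# Selecting on the first level returns every row of that food group
yoghurts = indexed.loc["Yoghurt"]

print(yoghurts)
```

Whether this is worth doing depends on the later analysis; a plain `groupby` on the food_group column may be enough for simple counts.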

















$endgroup$












  • $begingroup$
    The first name is 4-Grain Flakes, but I will only get 4-Grain. How can I handle that?
    $endgroup$
    – KHAN irfan
    28 mins ago










  • $begingroup$
    @KHANirfan I am splitting on the comma (,), so I do indeed get 4-Grain Flakes. See the top of my dataframe, which I added to my answer.
    $endgroup$
    – n1k31t4
    20 mins ago










  • $begingroup$
    Will Yoghurt and Yoghurt With Quark be separate food categories?
    $endgroup$
    – KHAN irfan
    18 mins ago










  • $begingroup$
    Yes. Everything to the left of the first comma is taken. If you want to be more specific with your categories, you probably can't do it in a straightforward manner, as I have above. If each row might have its own rules, you will probably have to fix the strange cases by hand, or generate a new input file that reflects your ideas about what a food category is.
    $endgroup$
    – n1k31t4
    11 mins ago










  • $begingroup$
    Thanks for your input. Please try my snippet, it does the same thing. :)
    $endgroup$
    – KHAN irfan
    7 mins ago
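
One possible way to "fix the strange cases by hand", as suggested in the comments above, is a small override table applied after the comma split. This is only a sketch: the `GROUP_OVERRIDES` dictionary and the `food_group` function are hypothetical, and the entries in the table reflect one guess at which labels should be merged.

```python
# Hypothetical override table: maps a first-segment label to the broader
# food group it should be merged into. Extend it by hand as needed.
GROUP_OVERRIDES = {
    "Yoghurt Sauce": "Yoghurt",
    "Yoghurt With Jam": "Yoghurt",
    "Yoghurt With Muesli": "Yoghurt",
    "Yoghurt With Quark": "Yoghurt",
}


def food_group(name, overrides=GROUP_OVERRIDES):
    """Return the food group for a name: the text before the first comma,
    replaced by a hand-written override when one exists."""
    label = name.split(",", 1)[0].strip()
    return overrides.get(label, label)


print(food_group("Yoghurt With Quark, Flavoured, Arla, 1.4% Fat"))
print(food_group("4-Grain Flakes, Gluten Free"))
```

Labels without an entry in the table fall through unchanged, so "4-Grain Flakes" stays as it is while the listed yoghurt variants collapse into "Yoghurt".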











Your Answer








StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "557"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













draft saved

draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f49641%2fretrieve-food-groups-from-food-item-list%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









2












$begingroup$

You can actually do the string-spitting and indexing on the columns themselves - no need to extract the column and do list comprehensions.



Below I take whatever is before the first comma and put it in a column called food_group and then the first field after the same column and put it in a new column called sub_cat-egory:



df["food_group"] = df.name.str.split(",").str[0]
df["sub_cat"] = df.name.str.split(",").str[1]


Here is example output for some Yogurt data:



 id name food_group sub_cat

44 4082 Yoghurt, Plain, Organic, 3% Fat Yoghurt Plain
45 4083 Yoghurt, Plain, Pirkka Reducol, 2.5% Fat, Low-... Yoghurt Plain
46 4084 Yoghurt, Turkish/Greek, 10% Fat Yoghurt Turkish/Greek
47 4085 Yoghurt, Turkish/Greek, 10% Fat, Lactose-Free Yoghurt Turkish/Greek
48 4086 Yoghurt Sauce Yoghurt Sauce NaN


Notice that any fields that are empty are filled with NaN. This will happen, when your name column only contains a single field (i.e. no commas).



EDIT



Here is the top of my dataframe, after the operation above:



In [13]: df.head(10) 
Out[13]:
id name food_group sub_cat
0 0 4-Grain Flakes 4-Grain Flakes NaN
1 1 4-Grain Flakes, Gluten Free 4-Grain Flakes Gluten Free
2 2 4-Grain Flakes, Riihikosken Vehnämylly 4-Grain Flakes Riihikosken Vehnämylly
3 3 Almond Almond NaN
4 4 Almond Drink, Sweetened, Alrpo Almond Drink Sweetened
5 5 Almond Drink, Unsweetened, Alrpo Almond Drink Unsweetened
6 6 Amaranth Flakes Amaranth Flakes NaN
7 7 Anchovy Anchovy NaN
8 8 Apple, Average, With Skin Apple Average
9 9 Apple, Domestic, Without Skin Apple Domestic



You could continue to make a multi-index from these two new columns, but is might not be necessary - it depends on what you want to do afterwards with the data.






share|improve this answer











$endgroup$












  • $begingroup$
    the first name is 4-Grain Flakes, I will only get 4-Grain, how can I handle it?
    $endgroup$
    – KHAN irfan
    28 mins ago










  • $begingroup$
    @KHANirfan - I am splitting on the , - meaning I do indeed get 4-Grain Flakes. See the top of my dataframe, added to my answer.
    $endgroup$
    – n1k31t4
    20 mins ago










  • $begingroup$
    Yoghurt and Yoghurt With Quark will be a separate food catagory?
    $endgroup$
    – KHAN irfan
    18 mins ago










  • $begingroup$
    Yes. Everything to the left of the first comma is taken. If you want to be more specific with you categories, you probably can't do it in a straightforward manner, as I have above. If each row might have its own rules, you will have to probably fix the strange cases by hand, or generate a new input file that reflects your ideas about what is a food category.
    $endgroup$
    – n1k31t4
    11 mins ago










  • $begingroup$
    Thanks for your input. Please try my snippet, it does the same thing. :)
    $endgroup$
    – KHAN irfan
    7 mins ago















2












$begingroup$

You can actually do the string-spitting and indexing on the columns themselves - no need to extract the column and do list comprehensions.



Below I take whatever is before the first comma and put it in a column called food_group and then the first field after the same column and put it in a new column called sub_cat-egory:



df["food_group"] = df.name.str.split(",").str[0]
df["sub_cat"] = df.name.str.split(",").str[1]


Here is example output for some Yogurt data:



 id name food_group sub_cat

44 4082 Yoghurt, Plain, Organic, 3% Fat Yoghurt Plain
45 4083 Yoghurt, Plain, Pirkka Reducol, 2.5% Fat, Low-... Yoghurt Plain
46 4084 Yoghurt, Turkish/Greek, 10% Fat Yoghurt Turkish/Greek
47 4085 Yoghurt, Turkish/Greek, 10% Fat, Lactose-Free Yoghurt Turkish/Greek
48 4086 Yoghurt Sauce Yoghurt Sauce NaN


Notice that any fields that are empty are filled with NaN. This will happen, when your name column only contains a single field (i.e. no commas).



EDIT



Here is the top of my dataframe, after the operation above:



In [13]: df.head(10) 
Out[13]:
id name food_group sub_cat
0 0 4-Grain Flakes 4-Grain Flakes NaN
1 1 4-Grain Flakes, Gluten Free 4-Grain Flakes Gluten Free
2 2 4-Grain Flakes, Riihikosken Vehnämylly 4-Grain Flakes Riihikosken Vehnämylly
3 3 Almond Almond NaN
4 4 Almond Drink, Sweetened, Alrpo Almond Drink Sweetened
5 5 Almond Drink, Unsweetened, Alrpo Almond Drink Unsweetened
6 6 Amaranth Flakes Amaranth Flakes NaN
7 7 Anchovy Anchovy NaN
8 8 Apple, Average, With Skin Apple Average
9 9 Apple, Domestic, Without Skin Apple Domestic



You could continue to make a multi-index from these two new columns, but is might not be necessary - it depends on what you want to do afterwards with the data.






share|improve this answer











$endgroup$












  • $begingroup$
    the first name is 4-Grain Flakes, I will only get 4-Grain, how can I handle it?
    $endgroup$
    – KHAN irfan
    28 mins ago










  • $begingroup$
    @KHANirfan - I am splitting on the , - meaning I do indeed get 4-Grain Flakes. See the top of my dataframe, added to my answer.
    $endgroup$
    – n1k31t4
    20 mins ago










  • $begingroup$
    Yoghurt and Yoghurt With Quark will be a separate food catagory?
    $endgroup$
    – KHAN irfan
    18 mins ago










  • $begingroup$
    Yes. Everything to the left of the first comma is taken. If you want to be more specific with you categories, you probably can't do it in a straightforward manner, as I have above. If each row might have its own rules, you will have to probably fix the strange cases by hand, or generate a new input file that reflects your ideas about what is a food category.
    $endgroup$
    – n1k31t4
    11 mins ago










  • $begingroup$
    Thanks for your input. Please try my snippet, it does the same thing. :)
    $endgroup$
    – KHAN irfan
    7 mins ago













2












2








2





$begingroup$

You can actually do the string-spitting and indexing on the columns themselves - no need to extract the column and do list comprehensions.



Below I take whatever is before the first comma and put it in a column called food_group and then the first field after the same column and put it in a new column called sub_cat-egory:



df["food_group"] = df.name.str.split(",").str[0]
df["sub_cat"] = df.name.str.split(",").str[1]


Here is example output for some Yogurt data:



 id name food_group sub_cat

44 4082 Yoghurt, Plain, Organic, 3% Fat Yoghurt Plain
45 4083 Yoghurt, Plain, Pirkka Reducol, 2.5% Fat, Low-... Yoghurt Plain
46 4084 Yoghurt, Turkish/Greek, 10% Fat Yoghurt Turkish/Greek
47 4085 Yoghurt, Turkish/Greek, 10% Fat, Lactose-Free Yoghurt Turkish/Greek
48 4086 Yoghurt Sauce Yoghurt Sauce NaN


Notice that any fields that are empty are filled with NaN. This will happen, when your name column only contains a single field (i.e. no commas).



EDIT



Here is the top of my dataframe, after the operation above:



In [13]: df.head(10)
Out[13]:
   id  name                                    food_group       sub_cat
0  0   4-Grain Flakes                          4-Grain Flakes   NaN
1  1   4-Grain Flakes, Gluten Free             4-Grain Flakes   Gluten Free
2  2   4-Grain Flakes, Riihikosken Vehnämylly  4-Grain Flakes   Riihikosken Vehnämylly
3  3   Almond                                  Almond           NaN
4  4   Almond Drink, Sweetened, Alrpo          Almond Drink     Sweetened
5  5   Almond Drink, Unsweetened, Alrpo        Almond Drink     Unsweetened
6  6   Amaranth Flakes                         Amaranth Flakes  NaN
7  7   Anchovy                                 Anchovy          NaN
8  8   Apple, Average, With Skin               Apple            Average
9  9   Apple, Domestic, Without Skin           Apple            Domestic



You could go on to build a multi-index from these two new columns, but it might not be necessary - it depends on what you want to do with the data afterwards.
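To illustrate both options, here is a rough sketch that builds a multi-index from the two new columns and, alternatively, just groups on food_group without changing the index. The small dataframe is a hypothetical stand-in for the rows shown above; which approach fits depends on the later analysis.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of the dataframe shown above.
df = pd.DataFrame({
    "id": [3, 8, 9],
    "name": ["Almond", "Apple, Average, With Skin", "Apple, Domestic, Without Skin"],
    "food_group": ["Almond", "Apple", "Apple"],
    "sub_cat": [None, "Average", "Domestic"],
})

# Option 1: a two-level index, so a whole food group can be selected at once.
indexed = df.set_index(["food_group", "sub_cat"])
apples = indexed.loc["Apple"]

# Option 2: keep the flat frame and group when needed.
counts = df.groupby("food_group").size()

print(apples)
print(counts)
```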

















$endgroup$











edited 19 mins ago
answered 33 mins ago
n1k31t4











  • $begingroup$
    The first name is 4-Grain Flakes, but I will only get 4-Grain. How can I handle that?
    $endgroup$
    – KHAN irfan
    28 mins ago










  • $begingroup$
    @KHANirfan - I am splitting on the comma, so I do indeed get 4-Grain Flakes. See the top of my dataframe, which I have added to my answer.
    $endgroup$
    – n1k31t4
    20 mins ago










  • $begingroup$
    Will Yoghurt and Yoghurt With Quark be separate food categories?
    $endgroup$
    – KHAN irfan
    18 mins ago










  • $begingroup$
    Yes. Everything to the left of the first comma is taken. If you want to be more specific with your categories, you probably can't do it in a straightforward manner, as I have above. If each row might have its own rules, you will probably have to fix the strange cases by hand, or generate a new input file that reflects your ideas about what a food category is.
    $endgroup$
    – n1k31t4
    11 mins ago










  • $begingroup$
    Thanks for your input. Please try my snippet, it does the same thing. :)
    $endgroup$
    – KHAN irfan
    7 mins ago
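One possible way to "fix the strange cases by hand", as suggested in the comments above, is to keep a small dictionary of overrides and apply it after the split. This is only a sketch: the overrides mapping and the decision to fold Yoghurt With Quark into Yoghurt are assumptions for illustration, not rules from the original data.

```python
import pandas as pd

# Hypothetical food_group values produced by the comma split.
df = pd.DataFrame({"food_group": ["Yoghurt", "Yoghurt With Quark", "Yoghurt Sauce"]})

# Hand-maintained corrections (assumed for illustration): raw group -> desired group.
overrides = {"Yoghurt With Quark": "Yoghurt"}

# Exact-match replacement; values not listed in the mapping are left unchanged.
df["food_group"] = df["food_group"].replace(overrides)

print(df)
```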
































