In order to come up with a good data model, you need to identify first ALL queries you would like to perform. If you only need to look up users by their firstname (or firstname and userID), then your second design is fine...
If you also need to look up users by their last name, then you could create another table having the same fields but a primary key on (lastname, userID). Obviously you will need to update both tables in the same time. Data duplication is fine in Cassandra.
Still, if you are concerned about the space needed for the two or more tables, you could create a single users table partitioned by user id, and additional tables for the fields you want to query by:
CREATE TABLE users (
userID uuid,
firstname text,
lastname text,
state text,
zip int,
PRIMARY KEY (userID)
);
CREATE TABLE users_by_firstname (
firstname text,
userid uuid,
PRIMARY KEY (firstname, userid)
);
The disadvantage of this solution is that you will need two queries to retrieve users by their first name:
SELECT userid FROM users_by_firstname WHERE firstname = 'Joe';
SELECT * FROM users WHERE userid IN (...);
Hope this helps
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…