One of the most irritating things about using GAE for a brand-new app is having to wait for instances to be fired back up if no one has hit your servers in 15 minutes. Since the app is new, or just has few users, some users will run into periods of high latency with no clue that instances are being "spun up".
As far as I can see, you have these options based on the docs:
Use manual-scaling and set the number of instances to 1.
When you use manual-scaling, you get exactly the number of instances you set: no more, no less. This is clearly inefficient, as you may be paying for unused instances, and instances are not automatically added or removed as traffic increases or decreases.
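As a minimal sketch of this option, assuming the hyphenated names above refer to appengine-web.xml (the Java configuration file; in app.yaml the equivalent key is manual_scaling), the setting might look like this. Other elements a real file needs, such as the runtime, are omitted here:

```xml
<!-- Sketch only: fixed pool of exactly one instance, never scaled up or down. -->
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <manual-scaling>
    <!-- The app always runs this many instances, regardless of traffic. -->
    <instances>1</instances>
  </manual-scaling>
</appengine-web-app>
```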
Use basic-scaling and set idle-timeout to something like 24 or 48 hours.
This would keep your instance running as long as someone queries your API at least once within that time period.
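A rough sketch of this option, again assuming appengine-web.xml (app.yaml uses basic_scaling and idle_timeout). The timeout is written in minutes here, and whether the platform accepts a value this large is an assumption worth checking against the docs. The max-instances value of 5 is an arbitrary placeholder:

```xml
<!-- Sketch only: instances start on demand and shut down after being idle. -->
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <basic-scaling>
    <!-- Upper bound on instances; 5 is a placeholder, not a recommendation. -->
    <max-instances>5</max-instances>
    <!-- 1440 minutes = 24 hours; assumed to be an accepted value. -->
    <idle-timeout>1440m</idle-timeout>
  </basic-scaling>
</appengine-web-app>
```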
Use automatic-scaling with min-idle-instances and warm-up requests enabled.
This does not work as intended. According to these docs:
"if your app is serving no traffic, the first request to the app will always be a loading request, not a warmup request."
This does not solve our problem because if zero instances are running, then there is nothing to warm-up in the first place. Thus you still get latency on the first request.
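For reference, the configuration being described would look roughly like the sketch below, assuming appengine-web.xml (in app.yaml the corresponding keys are automatic_scaling and min_idle_instances, with warmup listed under inbound_services):

```xml
<!-- Sketch only: automatic scaling with one idle instance requested and warmup enabled. -->
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <automatic-scaling>
    <!-- Ask the scheduler to keep at least one idle instance available. -->
    <min-idle-instances>1</min-idle-instances>
  </automatic-scaling>
  <!-- Allow warmup requests to be sent to new instances before they take live traffic. -->
  <warmup-requests-enabled>true</warmup-requests-enabled>
</appengine-web-app>
```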
What I would like is to always have one instance running, scale up from there if traffic increases, and scale back down afterwards, but never below one instance. It would be like automatic-scaling but with 1 instance always running.
Is this possible in GAE? Or am I missing something?
For now, my temporary solution is to set my app to manual-scaling with 1 instance so that my app is at least usable for new users.